Abstract

We characterize the invariant filtering measures resulting from Kalman filtering with intermittent observations
([1]), where the observation arrival is modeled as a Bernoulli process. In [2], it was shown that there exists a
$\overline{\gamma}^{sb} > 0$ such that for every observation packet arrival probability $\overline{\gamma}$, $\overline{\gamma} > \overline{\gamma}^{sb} > 0$, the sequence of random conditional
error covariance matrices converges in distribution to a unique invariant distribution $\mu^{\overline{\gamma}}$ (independent of the filter
initialization). In this paper, we prove that, for controllable and observable systems, $\overline{\gamma}^{sb} = 0$ and that, as $\overline{\gamma} \uparrow 1$, the
family $\{\mu^{\overline{\gamma}}\}_{\overline{\gamma} > 0}$ of invariant distributions satisfies a moderate deviations principle (MDP) with a good rate function
$I$. The rate function $I$ is explicitly identified. In particular, our results show: (1) as $\overline{\gamma} \uparrow 1$, the family $\{\mu^{\overline{\gamma}}\}$ converges
weakly (in distribution) to the Dirac measure $\delta_{P^*}$, where $P^*$ is the fixed point of the discrete time Riccati operator;
(2) the probability of a rare event (an event bounded away from $P^*$) under $\mu^{\overline{\gamma}}$ decays to zero as a power law of $(1 - \overline{\gamma})$
as $\overline{\gamma} \uparrow 1$. The best exponent of such a power law decay is explicitly obtained by solving a deterministic variational
problem involving the MDP rate function $I$. These results offer a complete characterization of the family of invariant
distributions $\{\mu^{\overline{\gamma}}\}_{\overline{\gamma} > 0}$. We provide computationally efficient methods for solving the variational problems in question,
leading to efficient estimates of probabilities under the invariant measures. The analytical techniques developed in
this paper are fairly general and applicable to the analysis of a broader class of iterated function systems. Several
intermediate results obtained in the process are of independent interest.

1. Introduction 

A. Background and Motivation 

Kalman filtering with non-classical information pattern has received significant attention in the control and 
signal processing literature. There has been renewed interest, motivated by increasing real-time networked systems 
applications. Such networks operate under constrained resources with lack of supervised control centers leading 
to inherent sources of randomness in the information pattern. For reliable system operation, it is of interest to 
understand the asymptotic properties of such systems like stability and ergodicity. In [2], we studied this problem 
in the context of Kalman filtering with intermittent observations ([1].) The results in [2] establish an interesting 
dichotomy for the filtering error process and show, in particular, that stochastic boundedness of the sequence of 
conditional error covariance matrices (generated by the discrete time random Riccati equation (RRE)) is necessary 
and sufficient for its ergodicity. In other words, we showed the existence of a critical probability, $\overline{\gamma}^{sb}$, such that, if
the observation packet arrival probability $\overline{\gamma}$ is greater than $\overline{\gamma}^{sb}$, the sequence of random error covariance matrices
converges weakly (in distribution) to a unique invariant distribution $\mu^{\overline{\gamma}}$. We note here that stochastic boundedness is
a much weaker condition than moment stability, and, as shown in [2], convergence to a unique invariant distribution 
is possible under a packet arrival probability for which moment stability does not hold. In this context, we further 
note, that our work ([2]) provides a sample-path analysis of the RRE, in contrast to moment stability analysis, as 
is done conventionally in the literature (see, for example, [3], [4], [5], [6], [7], [8], [9], [10], [11] and also [2] for 
a detailed review of the literature.) 
To summarize, the results in [2] showed the existence and uniqueness of an attracting invariant measure $\mu^{\overline{\gamma}}$ for
the RRE, for every $\overline{\gamma} > \overline{\gamma}^{sb}$, to which the conditional error covariance matrices converge weakly when operated
at packet arrival probability $\overline{\gamma}$. In this paper, we prove that, for observable and controllable systems, $\overline{\gamma}^{sb} = 0$, and
hence $\mu^{\overline{\gamma}}$ exists and is unique for every $\overline{\gamma} > 0$ for such systems$^{1}$. The main goal of this paper is to undertake the
highly nontrivial problem of characterizing the resulting invariant measures $\mu^{\overline{\gamma}}$. In the non-classical information
case, characterization of the steady-state error covariance distribution $\mu^{\overline{\gamma}}$ is as important as characterizing the
deterministic fixed point of the Riccati equation in the classical case, as derived by Kalman [12] for discrete time
and by Kalman and Bucy [13] for continuous time.

We detail the key contributions of this paper. We show the following for observable and controllable systems. 

1. Stochastic boundedness: $\overline{\gamma}^{sb} = 0$. We prove the result stated in Theorem 9 in [2], by proving that, for controllable
and observable systems, $\overline{\gamma}^{sb} = 0$, and so, for any non-zero observation arrival probability $\overline{\gamma}$, the conditional error
covariance process is ergodic, with the unique attracting measure $\mu^{\overline{\gamma}}$.

2. Moderate deviation principle (MDP). We show that the family of invariant distributions $\{\mu^{\overline{\gamma}}\}_{\overline{\gamma} > 0}$ satisfies a
moderate deviations principle (MDP) with good rate function $I$ as $\overline{\gamma} \uparrow 1$. An immediate consequence (which is
rather intuitive but not natural) is that, as $\overline{\gamma} \uparrow 1$, the family of invariant measures $\{\mu^{\overline{\gamma}}\}$ converges weakly to the
Dirac mass $\delta_{P^*}$, where $P^*$ is the unique fixed point of the deterministic Riccati equation.

3. Probability of rare events. The MDP implies that the probabilities of 'rare events' (events bounded away from
$P^*$) decay to zero as $\overline{\gamma} \uparrow 1$. A natural question of practical and theoretical interest is the rate at which the probability
of such a rare event goes to zero. We show that the probability of such rare events decays as a power law of $(1 - \overline{\gamma})$
as $\overline{\gamma} \uparrow 1$.

4. Best decay: Variational problem. The best exponent for the power law decay of the probability of rare events
depends on the rare event of interest; it can be explicitly characterized in terms of the rate function $I(\cdot)$. Formally,
we have the following MDP asymptotics:

$$\mu^{\overline{\gamma}}(\Gamma) \sim (1 - \overline{\gamma})^{\inf_{X \in \Gamma} I(X)} \tag{1}$$

(this notation is made precise in the paper). Thus, the exact decay asymptotics of a rare event is obtained by solving
a variational problem involving the rate function $I$. Since the above MDP asymptotics hold for every Borel set
$\Gamma$, our result characterizes completely the family of invariant measures $\{\mu^{\overline{\gamma}}\}$.

5. Estimating the probability of rare events. The estimation of probabilities of rare events reduces to solving 
deterministic variational problems; this not only characterizes the decay rate of rare events but also gives insight 
into how such events occur. In Section 8, we show several techniques that can be employed to solve these variational 
problems efficiently. We emphasize that our analysis of reducing the problem of estimating probabilities of interest to 
solving variational problems efficiently is much more definitive and relevant than numerically estimating the invariant 
distributions. A naive numerical approach of simulating the distributions $\{\mu^{\overline{\gamma}}\}$ as $\overline{\gamma} \uparrow 1$ becomes meaningless as the
rare events of interest become increasingly difficult to observe as $\overline{\gamma} \uparrow 1$ (see also Section 9). One may take recourse
to sophisticated simulation techniques like importance sampling (see, for example, [14]), but such approaches require 
characterization of the distributions in question, which is addressed in this paper. 

1 The fact that $\overline{\gamma}^{sb} = 0$ was proved for systems with invertible observation matrices in [2]. The proof for general observable systems is
provided in Appendix B of the present paper.

More broadly, the techniques developed in this paper are fairly general and go beyond the setting of Kalman
filtering with intermittent observations. There is a key difference in the MDP arguments used here and conventional
methods for analyzing the moderate (or large) deviations for stationary measures of Markov processes, where it is 
generally assumed that the underlying Markov process is positive recurrent and moderate deviations of stationary 
measures then follow from that of finite dimensional distributions (see, for example, [15], [16].) However, the 
Markov processes governing the RRE are not, in general, positive recurrent, as is the case for a large class of 
iterated function systems ([17].) Our analysis proceeds by studying the probability measures induced on the space 
of random function compositions (strings) and developing its topological properties, as detailed in the paper. Several 
intermediate results obtained in the process are of independent interest and follow under more general assumptions. 
Our tools are applicable to the analysis of more complex networked control systems (see, for example, [18]) and 
hybrid or switched systems. 

We summarize the organization of the paper. Subsection 1-B presents notation and preliminaries on moderate 
deviations. Subsection 2-A sets up the problem and prior work is briefly reviewed in Subsection 2-B. Several key 
approximation results are presented in Section 3, whereas the main results of this paper are stated and discussed in 
Section 4. MDP for finite dimensional distributions of the RRE sequence is analyzed in Section 5, whereas Section 6 
systematically carries out the steps required to obtain the main results on MDP for stationary distributions, which 
is completed in Section 7. In Section 8, we present efficient ways to solve the variational problems involving the 
rate function. Numerical studies justifying the theoretical results for a scalar system are presented in Section 9.
Section 10 concludes the paper. 

B. Notation and Preliminaries 

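Denote by: $\mathbb{R}$, the reals; $\mathbb{R}^M$, the $M$-dimensional Euclidean space; $\mathbb{T}$, the integers; $\mathbb{T}_+$, the non-negative integers;
$\mathbb{N}$, the natural numbers; and $\mathcal{X}$, a generic space. For a subset $B \subset \mathcal{X}$, $\mathbb{I}_B : \mathcal{X} \longmapsto \{0, 1\}$ is the indicator function,
which is 1 when the argument is in $B$ and zero otherwise; and $id_{\mathcal{X}}$ is the identity function on $\mathcal{X}$. A metric space $\mathcal{X}$
with metric $d_{\mathcal{X}}$ is denoted by the pair $(\mathcal{X}, d_{\mathcal{X}})$. The corresponding Borel algebra is denoted by $\mathbb{B}(\mathcal{X})$. For $x \in \mathcal{X}$,
the open ball of radius $\varepsilon > 0$ centered at $x$ is denoted by $B_\varepsilon(x)$, i.e., $B_\varepsilon(x) = \{y \in \mathcal{X} \mid d_{\mathcal{X}}(y, x) < \varepsilon\}$. The closure
of $B_\varepsilon(x)$ is the closed ball of radius $\varepsilon > 0$ centered at $x$ and is denoted by $\overline{B}_\varepsilon(x)$. For any set $\Gamma \subset \mathcal{X}$, the open
$\varepsilon$-neighborhood of $\Gamma$ is given by

$$\Gamma_\varepsilon = \left\{ x \in \mathcal{X} \;\middle|\; \inf_{y \in \Gamma} d_{\mathcal{X}}(x, y) < \varepsilon \right\} \tag{2}$$

It can be shown that $\Gamma_\varepsilon$ is an open set. Similarly, the closed $\varepsilon$-neighborhood of $\Gamma$ is given by

$$\overline{\Gamma}_\varepsilon = \left\{ x \in \mathcal{X} \;\middle|\; \inf_{y \in \Gamma} d_{\mathcal{X}}(x, y) \leq \varepsilon \right\} \tag{3}$$

which is a closed set. For a set $\Gamma \subset \mathcal{X}$, we denote by $\Gamma^{\circ}$ and $\overline{\Gamma}$ its interior and closure respectively.

The Banach space of symmetric matrices

Let $\mathbb{S}^N$ denote the separable Banach space of symmetric $N \times N$ matrices, equipped with the induced 2-norm.
The subset $\mathbb{S}^N_+$ of positive semidefinite matrices is a closed, convex, solid, normal, minihedral cone in $\mathbb{S}^N$, with
non-empty interior $\mathbb{S}^N_{++}$, the set of positive definite matrices. The cone $\mathbb{S}^N_+$ induces a partial order in $\mathbb{S}^N$. For
$X, Y \in \mathbb{S}^N$, we write $X \preceq Y$ ($Y \succeq X$) to denote $Y - X \in \mathbb{S}^N_+$; $X \prec Y$ to denote $X \preceq Y$ and $X \neq Y$; $X \ll Y$
($Y \gg X$) to denote $Y - X \in \mathbb{S}^N_{++}$.

Limit Notations: Let $f : \mathbb{R} \longmapsto \mathbb{R}$ be a measurable function.

The notation $\lim_{z \to x} f(z) = y$ implies that for every sequence $\{z_n\}_{n \in \mathbb{N}}$ in $\mathbb{R}$ with $\lim_{n \to \infty} |z_n - x| = 0$, we have
$\lim_{n \to \infty} |f(z_n) - y| = 0$. The notation $\lim_{z \uparrow x} f(z) = y$ implies that for every sequence $\{z_n\}_{n \in \mathbb{N}}$ in $\mathbb{R}$ with $z_n < x$
and $\lim_{n \to \infty} |z_n - x| = 0$, we have $\lim_{n \to \infty} |f(z_n) - y| = 0$. The notations $\downarrow$ and $\uparrow$ have similar implications when
working with limit inferiors and superiors.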

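Probability measures on metric spaces: Let: $(\mathcal{X}, d_{\mathcal{X}})$ be a complete separable metric space $\mathcal{X}$ with metric $d_{\mathcal{X}}$;
$\mathbb{B}(\mathcal{X})$ its Borel algebra; $B(\mathcal{X})$ the Banach space of real-valued bounded functions on $\mathcal{X}$, equipped with the
sup-norm, i.e., for $f \in B(\mathcal{X})$, $\|f\| = \sup_{x \in \mathcal{X}} |f(x)|$; and $C_b(\mathcal{X})$ the subspace of $B(\mathcal{X})$ of continuous functions. Let $\mathcal{P}(\mathcal{X})$
be the set of probability measures on $\mathcal{X}$. For $\mu \in \mathcal{P}(\mathcal{X})$, we define the support of $\mu$, $\mathrm{supp}(\mu)$, by

$$\mathrm{supp}(\mu) = \{x \in \mathcal{X} \mid \mu(B_\varepsilon(x)) > 0, \ \forall \varepsilon > 0\} \tag{4}$$

It follows that $\mathrm{supp}(\mu)$ is a closed set. The sequence $\{\mu_t\}_{t \in \mathbb{T}_+}$ in $\mathcal{P}(\mathcal{X})$ converges weakly to $\mu \in \mathcal{P}(\mathcal{X})$ if

$$\lim_{t \to \infty} \langle f, \mu_t \rangle = \langle f, \mu \rangle, \quad \forall f \in C_b(\mathcal{X}) \tag{5}$$

Weak convergence is denoted by $\mu_t \Longrightarrow \mu$ and is also referred to as convergence in distribution. The weak topology
on $\mathcal{P}(\mathcal{X})$ generated by weak convergence can be metrized. In particular (see, e.g., [19]), one has the Prohorov metric
$d_P$ on $\mathcal{P}(\mathcal{X})$, such that the metric space $(\mathcal{P}(\mathcal{X}), d_P)$ is complete, separable, and a sequence $\{\mu_t\}_{t \in \mathbb{T}_+}$ in $\mathcal{P}(\mathcal{X})$
converges weakly to $\mu$ in $\mathcal{P}(\mathcal{X})$ iff $\lim_{t \to \infty} d_P(\mu_t, \mu) = 0$. The distance between two probability measures
in $\mathcal{P}(\mathcal{X})$ is computed as:

$$d_P(\mu_1, \mu_2) = \inf \{\varepsilon > 0 \mid \mu_1(\mathcal{F}) \leq \mu_2(\mathcal{F}_\varepsilon) + \varepsilon, \ \forall \text{ closed set } \mathcal{F}\} \tag{6}$$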

Moderate Deviations: 



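Definition 1.1 Let $\{\mu^{\overline{\gamma}}\}$ be a family of probability measures on the complete separable metric space $(\mathcal{X}, d_{\mathcal{X}})$
indexed by the real-valued parameter $\overline{\gamma}$ taking values in $(0, 1)$. Let $h : (0, 1) \longmapsto \mathbb{R}_+$ be a non-decreasing function
on $(0, 1)$ with

$$\lim_{\overline{\gamma} \uparrow 1} h(\overline{\gamma}) = \infty \tag{7}$$

Let $I : \mathcal{X} \longmapsto \overline{\mathbb{R}}_+$ be an extended valued lower semicontinuous function. The family $\{\mu^{\overline{\gamma}}\}$ is said to satisfy a
moderate deviations principle (MDP) with rate function $I(\cdot)$ at scale $h(\overline{\gamma})$ as $\overline{\gamma} \uparrow 1$ if the following holds:

$$\liminf_{\overline{\gamma} \uparrow 1} \frac{1}{h(\overline{\gamma})} \ln \mu^{\overline{\gamma}}(\mathcal{O}) \geq -\inf_{x \in \mathcal{O}} I(x), \quad \text{for every open set } \mathcal{O} \subset \mathcal{X} \tag{8}$$

$$\limsup_{\overline{\gamma} \uparrow 1} \frac{1}{h(\overline{\gamma})} \ln \mu^{\overline{\gamma}}(\mathcal{F}) \leq -\inf_{x \in \mathcal{F}} I(x), \quad \text{for every closed set } \mathcal{F} \subset \mathcal{X} \tag{9}$$

The function $I(\cdot)$ is called the MDP rate function. The lower semicontinuity implies that the level sets of $I(\cdot)$, i.e.,
sets of the form $\{x \in \mathcal{X} \mid I(x) \leq \alpha\}$ for every $\alpha \in \mathbb{R}_+$, are closed. If, in addition, the level sets are compact (for
every $\alpha$), $I(\cdot)$ is said to be a good rate function, and the corresponding family $\{\mu^{\overline{\gamma}}\}$ is said to satisfy an MDP with
good rate function $I(\cdot)$.

It can be shown that the MDP, as stated in (8)-(9), is equivalent to the following:

$$-\inf_{x \in \Gamma^{\circ}} I(x) \leq \liminf_{\overline{\gamma} \uparrow 1} \frac{1}{h(\overline{\gamma})} \ln \mu^{\overline{\gamma}}(\Gamma) \leq \limsup_{\overline{\gamma} \uparrow 1} \frac{1}{h(\overline{\gamma})} \ln \mu^{\overline{\gamma}}(\Gamma) \leq -\inf_{x \in \overline{\Gamma}} I(x) \tag{10}$$

for every measurable set $\Gamma$. In other words, (10) holds iff (8)-(9) hold.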

The above formulation of MDP is similar in spirit to the theory of large deviations principle (LDP). In fact, in
the above definition, if the scale function $h(\cdot)$ is a polynomial in $\overline{\gamma}$, the family $\{\mu^{\overline{\gamma}}\}$ is said to satisfy an LDP (see,
for example, [15], [20]). This conceptual similarity is manifested in some of the proof techniques developed in the
paper having parallels with their counterparts in the theory of LDP.

Before interpreting the consequences of an MDP as defined above, we consider the notion of a rare event, which 
is the central motivation to all MDP (and LDP): 

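Definition 1.2 (Rare Event): A set $\Gamma \in \mathbb{B}(\mathcal{X})$ is called a rare event with respect to the family $\{\mu^{\overline{\gamma}}\}$ of probability
measures, if $\lim_{\overline{\gamma} \uparrow 1} \mu^{\overline{\gamma}}(\Gamma) = 0$. In other words, the event $\Gamma$ becomes increasingly difficult to observe (i.e., it becomes
rare) as $\overline{\gamma} \uparrow 1$.

Once a rare event $\Gamma$ is identified, the next natural question is the rate at which its probability goes to zero under
$\mu^{\overline{\gamma}}$ as $\overline{\gamma} \uparrow 1$. This is answered by an MDP, which also gives a complete characterization of the family as $\overline{\gamma} \uparrow 1$.
Indeed, from Def. 1.1 it is not hard to see that, if the family $\{\mu^{\overline{\gamma}}\}$ satisfies an MDP, we have for every measurable
set $\Gamma \subset \mathcal{X}$:

$$c_1(\overline{\gamma}) \, e^{-h(\overline{\gamma}) \inf_{x \in \Gamma^{\circ}} I(x)} \leq \mu^{\overline{\gamma}}(\Gamma) \leq c_2(\overline{\gamma}) \, e^{-h(\overline{\gamma}) \inf_{x \in \overline{\Gamma}} I(x)} \tag{11}$$

where $c_1, c_2 : (0, 1) \longmapsto \mathbb{R}_+$ are functions such that $\lim_{\overline{\gamma} \uparrow 1} c_i(\overline{\gamma}) = 1$, $i = 1, 2$. For brevity, we subsequently use
the notation

$$\mu^{\overline{\gamma}}(\Gamma) \sim e^{-h(\overline{\gamma}) \inf_{x \in \Gamma} I(x)} \tag{12}$$

as a short form of (11).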

Now assume $\Gamma$ is a rare event and, to avoid unnecessary technicalities, also assume that $\Gamma$ is an $I$-continuity set,
i.e., $\inf_{x \in \Gamma^{\circ}} I(x) = \inf_{x \in \overline{\Gamma}} I(x) = \inf_{x \in \Gamma} I(x)$. Then, by (11) we must have $\inf_{x \in \Gamma} I(x) > 0$ (so that the probabilities decay to zero).
Thus, (11) implies that the probability of the rare event $\Gamma$ decays exponentially at a scale $h(\overline{\gamma})$ to zero, $\inf_{x \in \Gamma} I(x)$
being the best exponent (or rate) of decay. Such a characterization of the best decay rate of a rare event is extremely
important in system analysis and, as will be seen in the paper, offers considerable insight into system design apart
from providing a complete characterization of the measures $\mu^{\overline{\gamma}}$.
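
As a short worked instance of the short form (12), the following sketch substitutes the logarithmic scale $h(\overline{\gamma}) = -\ln(1 - \overline{\gamma})$, which is the scale stated later in Theorem 4.2. It only illustrates why an exponential decay at that scale reads as a power law in $(1 - \overline{\gamma})$; it is not an additional result.

```latex
\[
h(\overline{\gamma}) = -\ln(1-\overline{\gamma})
\;\Longrightarrow\;
e^{-h(\overline{\gamma}) \inf_{x \in \Gamma} I(x)}
= e^{\ln(1-\overline{\gamma}) \, \inf_{x \in \Gamma} I(x)}
= (1-\overline{\gamma})^{\inf_{x \in \Gamma} I(x)} .
\]
```

This matches the form of the asymptotics (1) announced in the Introduction.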

2. Problem Formulation 

We split the present section into two subsections: Subsection 2-A briefly summarizes the model of Kalman filtering
with intermittent observations, while Subsection 2-B reviews some results from [2] on weak convergence
of the random error covariance matrices resulting from the above filtering model.

A. Setup 

We start by reviewing the model of Kalman filtering with intermittent observations in [1]. Let

$$x_{t+1} = A x_t + w_t \tag{13}$$

$$y_t = C x_t + v_t \tag{14}$$

Here $x_t \in \mathbb{R}^N$ is the signal (state) vector, $y_t \in \mathbb{R}^M$ is the observation vector, $w_t \in \mathbb{R}^N$ and $v_t \in \mathbb{R}^M$ are Gaussian
random vectors with zero mean and covariance matrices $Q$ and $R$, respectively. The sequences $\{w_t\}_{t \in \mathbb{T}_+}$ and
$\{v_t\}_{t \in \mathbb{T}_+}$ are uncorrelated and mutually independent. Also, assume that the initial state $x_0$ is a zero-mean Gaussian
vector with covariance $P_0$. Unless otherwise stated, we use the following standing assumption throughout the paper:
Assumption (E.1): The pair $(C, A)$ is observable and $Q, R$ are positive definite. The assumption $Q \gg 0$ implies
the controllability of the pair $(A, Q^{1/2})$. The main results of the paper require all these assumptions, but several
intermediate results of independent interest hold under less stringent assumptions. In that case, they are noted
explicitly.

The m.m.s.e. predictor $\hat{x}_{t|t-1}$ of the signal vector $x_t$ given the observations $\{y_s\}_{0 \leq s < t}$ is the conditional mean. It
is recursively implemented by the Kalman filter. The sequence of conditional prediction error covariances, $\{P_t\}_{t \in \mathbb{T}_+}$,
is then given by

$$P_t = \mathbb{E}\left[ (x_t - \hat{x}_{t|t-1})(x_t - \hat{x}_{t|t-1})^T \mid \{y(s)\}_{0 \leq s < t} \right] \tag{15}$$

$$P_{t+1} = A P_t A^T + Q - A P_t C^T \left( C P_t C^T + R \right)^{-1} C P_t A^T \tag{16}$$

Under the hypothesis of controllability of the pair $(A, Q^{1/2})$ and observability of the pair $(C, A)$, the deterministic
sequence $\{P_t\}_{t \in \mathbb{T}_+}$ converges to a unique value $P^*$ (which is a fixed point of the algebraic Riccati equation (16))
from any initial condition $P_0$ [12].
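
The convergence of the recursion (16) to $P^*$ can be checked numerically. The following is a minimal sketch, not taken from the paper: the system matrices, the tolerance, and the function names `riccati_step` and `riccati_fixed_point` are all hypothetical choices made for illustration. It iterates (16) from two different initial conditions and compares the limits.

```python
import numpy as np

def riccati_step(P, A, C, Q, R):
    """One step of the deterministic Riccati recursion (16)."""
    S = C @ P @ C.T + R                    # C P C^T + R
    G = A @ P @ C.T @ np.linalg.inv(S)     # A P C^T (C P C^T + R)^{-1}
    return A @ P @ A.T + Q - G @ C @ P @ A.T

def riccati_fixed_point(A, C, Q, R, P0, tol=1e-10, max_iter=10_000):
    """Iterate (16) from P0 until successive iterates differ by less than tol."""
    P = P0
    for _ in range(max_iter):
        P_next = riccati_step(P, A, C, Q, R)
        if np.linalg.norm(P_next - P, 2) < tol:
            return P_next
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")

# Hypothetical example system: (C, A) observable, Q and R positive definite.
A = np.array([[1.2, 1.0], [0.0, 0.9]])
C = np.array([[1.0, 0.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P_from_zero = riccati_fixed_point(A, C, Q, R, P0=np.zeros((2, 2)))
P_from_large = riccati_fixed_point(A, C, Q, R, P0=100.0 * np.eye(2))
print(np.linalg.norm(P_from_zero - P_from_large, 2))  # small: both runs reach the same limit
```

Plain fixed-point iteration is used here only because it mirrors (16) directly; dedicated algebraic Riccati solvers would be the usual choice in practice.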

This corresponds to the classical perfect observation scenario, where the estimator has complete knowledge of the
observation packet $y_t$ at every time $t$. With intermittent observations, the observation packets are dropped randomly
(across the communication channel to the estimator), and the estimator receives observations at random times. We
study the intermittent observation model considered in [1], where the channel randomness is modeled by a sequence
$\{\gamma_t\}_{t \in \mathbb{T}_+}$ of i.i.d. Bernoulli random variables with mean $\overline{\gamma}$ (note, $\overline{\gamma}$ then denotes the arrival probability). Here,
$\gamma_t = 1$ corresponds to the arrival of the observation packet $y_t$ at time $t$ to the estimator, whereas a packet dropout
corresponds to $\gamma_t = 0$. Denote by $\tilde{y}_t$ the pair $\tilde{y}_t = (y_t \mathbb{I}(\gamma_t = 1), \gamma_t)$. Under the TCP packet acknowledgement protocol
in [1] (the estimator knows at each time whether the observation packet arrived or not), the m.m.s.e. predictor of
the signal is given by:

$$\hat{x}_{t|t-1} = \mathbb{E}\left[ x_t \mid \{\tilde{y}_s\}_{0 \leq s < t} \right] \tag{17}$$

A modified form of the Kalman filter giving a recursive implementation of the estimator in (17) is in [1]. The
sequence of conditional prediction error covariance matrices, $\{P_t\}_{t \in \mathbb{T}_+}$, is updated according to the following
random Riccati equation (RRE):

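$$P_t = \mathbb{E}\left[ (x_t - \hat{x}_{t|t-1})(x_t - \hat{x}_{t|t-1})^T \mid \{\tilde{y}(s)\}_{0 \leq s < t} \right] \tag{18}$$

$$P_{t+1} = A P_t A^T + Q - \gamma_t A P_t C^T \left( C P_t C^T + R \right)^{-1} C P_t A^T \tag{19}$$

A convenient representation of $P_t$ is obtained by defining the functions $f_0, f_1 : \mathbb{S}^N_+ \longmapsto \mathbb{S}^N_+$ as$^{2}$:

$$f_0(X) = A X A^T + Q, \quad \forall X \in \mathbb{S}^N_+ \tag{20}$$

$$f_1(X) = A X A^T + Q - A X C^T \left( C X C^T + R \right)^{-1} C X A^T, \quad \forall X \in \mathbb{S}^N_+ \tag{21}$$

We then have for all $t \geq 1$

$$P_t = f_{\gamma_{t-1}} \circ f_{\gamma_{t-2}} \circ \cdots \circ f_{\gamma_0}(P_0) \tag{22}$$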
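
The representation (22) translates directly into a simulation of one sample path of the RRE. The sketch below is illustrative only: the scalar system values, the seed, and the helper name `simulate_rre` are assumptions, not quantities from the paper. Each step applies the Riccati operator (21) when the packet arrives and the Lyapunov operator (20) when it is dropped.

```python
import numpy as np

def f0(P, A, Q):
    """Lyapunov operator (20): update used when the packet is dropped."""
    return A @ P @ A.T + Q

def f1(P, A, C, Q, R):
    """Riccati operator (21): update used when the packet arrives."""
    S = C @ P @ C.T + R
    return A @ P @ A.T + Q - A @ P @ C.T @ np.linalg.solve(S, C @ P @ A.T)

def simulate_rre(A, C, Q, R, P0, gamma_bar, T, rng):
    """Return the path [P_0, ..., P_T] and the arrival indicators gamma_0, ..., gamma_{T-1}."""
    arrivals = rng.random(T) < gamma_bar   # i.i.d. Bernoulli(gamma_bar) arrivals
    path = [P0]
    for arrived in arrivals:
        P = path[-1]
        path.append(f1(P, A, C, Q, R) if arrived else f0(P, A, Q))
    return path, arrivals

# Hypothetical scalar system (N = M = 1), chosen only for illustration.
A = np.array([[1.2]])
C = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])

rng = np.random.default_rng(0)
path, arrivals = simulate_rre(A, C, Q, R, P0=np.array([[1.0]]), gamma_bar=0.7, T=50, rng=rng)
print(arrivals[:10].astype(int), [float(P[0, 0]) for P in path[:5]])
```

Setting `gamma_bar=1.0` in this sketch reproduces the deterministic recursion (16), since every packet then arrives.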

Unlike the classical case, the sequence $\{P_t\}_{t \in \mathbb{T}_+}$ is now random (because of its dependence on the random sequence
$\{\gamma_t\}_{t \in \mathbb{T}_+}$). Thus, for each $t$, $P_t$ is a random element of $\mathbb{S}^N_+$, and we denote by $\mu_t^{\overline{\gamma}, P_0}$ its distribution (the measure
it induces on $\mathbb{S}^N_+$). The superscripts $\overline{\gamma}, P_0$ emphasize the dependence of $\mu_t^{\overline{\gamma}, P_0}$ on the packet arrival probability
and the initial condition. We often use the notations $\mathbb{P}^{\overline{\gamma}, P_0}$, $\mathbb{E}^{\overline{\gamma}, P_0}$ to denote probability and expectation operators
respectively, when the system is operated with observation arrival probability $\overline{\gamma}$ and initial covariance $P_0$.

2 $f_0$ corresponds to the Lyapunov operator, whereas $f_1$ is the Riccati operator.

B. Prior work 

The following extends Theorems 9, 10 in [2] on the weak convergence of the RRE sequence. 

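Theorem 2.1 Let assumption (E.1) hold. Then,

• For each $\overline{\gamma} > 0$, there exists a unique invariant distribution $\mu^{\overline{\gamma}}$ s.t. the sequence $\{P_t\}_{t \in \mathbb{T}_+}$ (or the sequence
$\{\mu_t^{\overline{\gamma}, P_0}\}_{t \in \mathbb{T}_+}$ of measures) converges weakly to $\mu^{\overline{\gamma}}$ from any initial condition $P_0 \in \mathbb{S}^N_+$.

• Define the set $\mathcal{S} \subset \mathbb{S}^N_+$ by

$$\mathcal{S} = \{ f_{i_1} \circ f_{i_2} \circ \cdots \circ f_{i_s}(P^*) \mid i_r \in \{0, 1\}, \ 1 \leq r \leq s, \ s \in \mathbb{T}_+ \} \tag{23}$$

Then$^{3}$, if $0 < \overline{\gamma} < 1$,

$$\mathrm{supp}(\mu^{\overline{\gamma}}) = \mathrm{cl}(\mathcal{S}) \tag{24}$$

where $\mathrm{cl}(\mathcal{S})$ denotes the topological closure of $\mathcal{S}$ in $\mathbb{S}^N_+$ and supp denotes the support of a probability measure
([21]). In particular, we have

$$\mu^{\overline{\gamma}}\left( \{ Y \in \mathbb{S}^N_+ \mid Y \succeq P^* \} \right) = 1 \tag{25}$$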

The proof is in Appendix B. For a detailed discussion of the above results, the reader is referred to [2]. 

3. SOME KEY APPROXIMATION RESULTS 

In this section we present some results on random compositions of Lyapunov and Riccati operators leading to 
the RRE sequence (Subsection 3-A.) In Subsection 3-B, we present some key approximation results that are of 
independent interest and establish several useful properties of the RRE and the classical Riccati operator. 

A. Preliminary Results 

The RRE sequence is an iterated function system (see, for example, [17]) comprising of random compositions 
of Lyapunov and Riccati operators. To understand the system, we study the behavior of such random function 
compositions, where not only the numerical value of the composition is important, but also the composition pattern 
is relevant. To formalize this study, we start with the following definitions: 

Definition 3.1 (String): Let $P_0 \in \mathbb{S}^N_+$. A string $\mathcal{R}$ with initial state $P_0$ and length $n \in \mathbb{T}_+$ is an $(n + 1)$-tuple of the
form:

$$\mathcal{R} = (f_{i_1}, f_{i_2}, \cdots, f_{i_n}, P_0), \quad i_1, \cdots, i_n \in \{0, 1\} \tag{26}$$

where $f_0$ and $f_1$ correspond to the Lyapunov and Riccati updates in eqns. (20), (21). The length of a string $\mathcal{R}$ is
denoted by $\mathrm{len}(\mathcal{R})$. The set of all possible strings is denoted by $\mathcal{S}$.

Remark 3.2 Note that a string $\mathcal{R}$ can be of length 0; then it is represented as a 1-tuple, consisting of only the
initial condition. We introduce notation here. Let $t_1, t_2, \cdots, t_l$ be non-negative integers, such that $\sum_{i=1}^{l} t_i = n$
and $i^k_j \in \{0, 1\}$ for $1 \leq j \leq t_k$, $1 \leq k \leq l$, such that, for all $k$, $i^k_j = i^k_1$, $1 \leq j \leq t_k$. Let $\mathcal{R}$ be a string of length
$n$ of the form:

$$\mathcal{R} = \left( f_{i^1_1}, \cdots, f_{i^1_{t_1}}, f_{i^2_1}, \cdots, f_{i^2_{t_2}}, \cdots, f_{i^l_1}, \cdots, f_{i^l_{t_l}}, P_0 \right) \tag{27}$$

where the indices satisfy the relations above. For brevity, we will write $\mathcal{R}$ as:

$$\mathcal{R} = \left( f^{t_1}_{i^1_1}, f^{t_2}_{i^2_1}, \cdots, f^{t_l}_{i^l_1}, P_0 \right) \tag{28}$$

For example, the string $(f_0, f_1, f_1, f_1, f_0, f_0, P_0)$ is written concisely as $(f_0, f_1^3, f_0^2, P_0)$.

3 In the definition of $\mathcal{S}$ ((23)), $s$ can take the value 0, implying $P^* \in \mathcal{S}$.

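Definition 3.3 (Numerical Value of a String): To every string $\mathcal{R}$ is associated its numerical value, denoted by
$\mathcal{N}(\mathcal{R})$, which is the numerical evaluation of the function composition on the initial state $P_0$, i.e., for $\mathcal{R}$ of the form

$\mathcal{R} = (f_{i_1}, f_{i_2}, \cdots, f_{i_n}, P_0)$, $i_1, \cdots, i_n \in \{0, 1\}$, we have$^{4}$

$$\mathcal{N}(\mathcal{R}) = f_{i_1} \circ f_{i_2} \circ \cdots \circ f_{i_n}(P_0) \tag{29}$$

Thus, the numerical value can be viewed as a function $\mathcal{N}(\cdot)$ from the space $\mathcal{S}$ of strings to $\mathbb{S}^N_+$. We abuse notation
by denoting $\mathcal{N}(\mathcal{S})$ to be the set of numerical values attainable, i.e.,

$$\mathcal{N}(\mathcal{S}) = \{ \mathcal{N}(\mathcal{R}) \mid \mathcal{R} \in \mathcal{S} \} \tag{30}$$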

Remark 3.4 Note the difference between a string and its numerical value. Two strings are equal iff they comprise of 
the same order of function compositions applied to the same initial state. In particular, two strings can be different, 
even if they evaluate to the same numerical value. 
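
The distinction drawn in Remark 3.4 can be made concrete with a small sketch. Everything below is an illustrative assumption: a hypothetical scalar system, strings encoded as tuples of indices in {0, 1}, and a helper named `numerical_value`. Because $P^*$ is a fixed point of $f_1$, the strings $(f_0, P^*)$ and $(f_0, f_1, f_1, P^*)$ are different strings with the same numerical value.

```python
import math

# Hypothetical scalar system (N = M = 1); the values are for illustration only.
A, C, Q, R = 1.2, 1.0, 1.0, 1.0

def f0(x):
    """Scalar Lyapunov operator (20)."""
    return A * A * x + Q

def f1(x):
    """Scalar Riccati operator (21)."""
    return A * A * x + Q - (A * C * x) ** 2 / (C * C * x + R)

def numerical_value(indices, p0):
    """Evaluate f_{i_1} o f_{i_2} o ... o f_{i_n}(p0); the last index acts first."""
    x = p0
    for i in reversed(indices):
        x = f1(x) if i == 1 else f0(x)
    return x

# Fixed point P* of the Riccati operator, obtained by plain iteration.
p_star = 1.0
for _ in range(200):
    p_star = f1(p_star)

string_a = (0,)          # the string (f_0, P*)
string_b = (0, 1, 1)     # the string (f_0, f_1, f_1, P*)
value_a = numerical_value(string_a, p_star)
value_b = numerical_value(string_b, p_star)
print(string_a != string_b, math.isclose(value_a, value_b, rel_tol=1e-9))
```

The tuple encoding keeps the composition pattern explicit, which is the information a string carries beyond its numerical value.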

Definition 3.5 (Concatenated Strings): Let $n \in \mathbb{N}$ and $t_1 < t_2 < \cdots < t_n \in \mathbb{T}_+$. Also, if $t_n \geq 1$, choose
$i_1, i_2, \cdots, i_{t_n} \in \{0, 1\}$. Then

$$\mathcal{R} = \left( (f_{i_{t_1}}, \cdots, f_{i_1}, P_0), (f_{i_{t_2}}, \cdots, f_{i_{t_1}}, \cdots, f_{i_1}, P_0), \cdots, (f_{i_{t_n}}, \cdots, f_{i_1}, P_0) \right) \tag{31}$$

is a concatenated string of block length $n$ with initial state $P_0$.$^{5}$

We similarly define the numerical value of such a concatenated string $\mathcal{R}$ by

$$\mathcal{N}(\mathcal{R}) = \left( f_{i_{t_1}} \circ \cdots \circ f_{i_1}(P_0), \ f_{i_{t_2}} \circ \cdots \circ f_{i_{t_1}} \circ \cdots \circ f_{i_1}(P_0), \ \cdots, \ f_{i_{t_n}} \circ \cdots \circ f_{i_1}(P_0) \right) \tag{32}$$

and note that $\mathcal{N}(\mathcal{R}) \in \bigotimes_{i=1}^{n} \mathbb{S}^N_+$.

For fixed $P_0$, $n$, $t_1 < t_2 < \cdots < t_n$, the set of such concatenated strings is denoted by $\mathcal{S}^{P_0}_{t_1, \cdots, t_n}$. The corresponding
set of numerical values is denoted by $\mathcal{N}(\mathcal{S}^{P_0}_{t_1, \cdots, t_n})$.

Finally, for $X \in \bigotimes_{i=1}^{n} \mathbb{S}^N_+$, the set $\mathcal{S}^{P_0}_{t_1, \cdots, t_n}(X) \subset \mathcal{S}^{P_0}_{t_1, \cdots, t_n}$ consists of all strings with numerical value $X$, i.e.,

$$\mathcal{S}^{P_0}_{t_1, \cdots, t_n}(X) = \left\{ \mathcal{R} \in \mathcal{S}^{P_0}_{t_1, \cdots, t_n} \mid \mathcal{N}(\mathcal{R}) = X \right\} \tag{33}$$

A rigorous algebra of such strings can be developed, which we undertake elsewhere. In the following, we present
some important properties of strings to be used later (see Appendix A for a proof):

4 For function compositions, we adopt a similar notation to that of strings, namely, for example, we denote the composition
$f_0 \circ f_1 \circ f_1 \circ f_1 \circ f_0 \circ f_0(P_0)$ by $f_0 \circ f_1^3 \circ f_0^2(P_0)$.

5 We again adopt the convention that the first block is simply $P_0$ if $t_1 = 0$.
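Proposition 3.6 (i) For $s, t \in \mathbb{T}_+$ and $s \leq t$, we have $\mathcal{N}(\mathcal{S}^{P^*}_s) \subset \mathcal{N}(\mathcal{S}^{P^*}_t)$. In particular, if for some $X \in \mathbb{S}^N_+$,
$t_0 \in \mathbb{T}_+$ and $i_1, \cdots, i_{t_0} \in \{0, 1\}$, the string $\mathcal{R} = (f_{i_1}, \cdots, f_{i_{t_0}}, P^*)$ belongs to $\mathcal{S}^{P^*}_{t_0}(X)$, we have

$$\left( f_{i_1}, \cdots, f_{i_{t_0}}, f_1^{t - t_0}, P^* \right) \in \mathcal{S}^{P^*}_t(X) \subset \mathcal{S}^{P^*}(X), \quad \forall t \geq t_0 \tag{34}$$

(ii) Fix $n \in \mathbb{N}$ and $t_1 < \cdots < t_n \in \mathbb{T}_+$. We then have, for $0 < \overline{\gamma} < 1$,

$$\mathbb{P}^{\overline{\gamma}, P_0}\left( (P_{t_1}, \cdots, P_{t_n}) \in \mathcal{N}\left( \mathcal{S}^{P_0}_{t_1, \cdots, t_n} \right) \right) = 1 \tag{35}$$

(iii) Let $t \in \mathbb{T}_+$ and $\mathcal{R} \in \mathcal{S}^{P_0}_t$, $\mathcal{R} = (f_{i_1}, \cdots, f_{i_t}, P_0)$, be a string. Then, there exists an $\alpha_{P_0} \in \mathbb{R}_+$, depending on $P_0$,
such that,

$$f_0^{\pi(\mathcal{R})}\left( \alpha_{P_0} I \right) \succeq \mathcal{N}(\mathcal{R}) \tag{36}$$

where

$$\pi(\mathcal{R}) = \begin{cases} \sum_{k=1}^{t} \mathbb{I}_{\{0\}}(i_k) & \text{if } t \geq 1 \\ 0 & \text{otherwise} \end{cases} \tag{37}$$

counts the number of $f_0$'s in $\mathcal{R}$.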



B. Some approximation results 

In this subsection we present several approximation results to be used in the sequel. The results are of independent 
interest and establish some useful properties of the RRE and the classical Riccati operator. 

The first concerns uniform convergence properties of the classical Riccati operator and is used in the sequel to
obtain various tightness estimates required for establishing the MDP. The proof is provided in Appendix A.

Lemma 3.7 For every $\varepsilon > 0$, there exists $t_\varepsilon \in \mathbb{N}$, such that, for every $X \in \mathbb{S}^N_+$ with $X \succeq P^*$,

$$\| f_1^t(X) - P^* \| \leq \varepsilon, \quad t \geq t_\varepsilon \tag{38}$$

Note, in particular, that $t_\varepsilon$ can be chosen independently of the initial state $X$.
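
The uniformity in Lemma 3.7 can be observed numerically in a scalar example. The sketch below is an illustration under assumed values, not part of the proof: the scalar system, the tolerance `eps`, the list of initial states, and the helper `steps_to_eps` are hypothetical. It counts how many applications of $f_1$ are needed to enter the $\varepsilon$-ball around $P^*$ from increasingly large initial states.

```python
# Hypothetical scalar system; in the scalar case the 2-norm is the absolute value.
A, C, Q, R = 1.2, 1.0, 1.0, 1.0

def f1(x):
    """Scalar Riccati operator (21)."""
    return A * A * x + Q - (A * C * x) ** 2 / (C * C * x + R)

# Fixed point P* of f1 by plain iteration.
p_star = 1.0
for _ in range(200):
    p_star = f1(p_star)

def steps_to_eps(x, eps, max_steps=1000):
    """Smallest t with |f1^t(x) - P*| <= eps."""
    for t in range(max_steps + 1):
        if abs(x - p_star) <= eps:
            return t
        x = f1(x)
    raise RuntimeError("did not reach the eps-ball")

eps = 1e-6
starts = [p_star, 10.0, 1e3, 1e6, 1e12]
steps = {x0: steps_to_eps(x0, eps) for x0 in starts}
print(steps)  # the step counts stay bounded as the initial state grows
```

In this scalar case the bound is easy to see: $f_1(x)$ never exceeds $A^2 R / C^2 + Q$, so a single step brings any initial state into a bounded interval, after which the contraction towards $P^*$ takes over.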

The following result can be viewed as a corollary to Lemma 3.7 and concerns the Lipschitz continuity of finite 
compositions of the Riccati operator. 

Lemma 3.8 For fixed $t \in \mathbb{N}$ and $i_1, \cdots, i_t \in \{0, 1\}$, define the function $g : \mathbb{S}^N_+ \longmapsto \mathbb{S}^N_+$ by

$$g(X) = f_{i_1} \circ \cdots \circ f_{i_t}(X), \quad X \in \mathbb{S}^N_+ \tag{39}$$

Then $g(\cdot)$ is Lipschitz continuous with some constant $K_g > 0$.

Also, for every $\varepsilon_2 > 0$, there exists $t_{\varepsilon_2}$, such that, the function $f_1^{t_{\varepsilon_2}}(\cdot)$ is Lipschitz continuous with constant
$K_{f_1^{t_{\varepsilon_2}}} < \varepsilon_2$.

Proof: From (174) it follows that the function $f_1(\cdot)$ is Lipschitz continuous with constant $K_{f_1} = c_1 e^{-c_2}$, where
$c_1, c_2$ are positive constants defined in Lemma 3.7. It is also easy to see that the affine function $f_0(\cdot)$ is Lipschitz
continuous with constant $K_{f_0} = \alpha^2$, where $\alpha$ is the largest singular value of the matrix $A$.

It then follows that the function $g(\cdot)$ defined above is Lipschitz continuous, being a finite composition of Lipschitz
continuous functions.
For the second assertion, choose $t_{\varepsilon_2} \in \mathbb{N}$, such that, $c_1 e^{-c_2 t_{\varepsilon_2}} < \varepsilon_2$, where $c_1, c_2 > 0$ are defined in Lemma 3.7,
equation (174). It then follows from (174) that, with the above choice of $t_{\varepsilon_2}$, the function $f_1^{t_{\varepsilon_2}}(\cdot)$ is Lipschitz
continuous with constant $K_{f_1^{t_{\varepsilon_2}}} < \varepsilon_2$. ■

The following result concerns stochastic boundedness of the random sequence $\{P_t\}_{t \in \mathbb{T}_+}$ generated by the RRE.
In particular, it completes the proof of Proposition 8 in [2] by establishing triviality of $\overline{\gamma}^{sb}$ for general observable
and controllable systems (in [2], Proposition 8 was proved only for systems with invertible $C$).

Lemma 3.9 Assume $(A, Q^{1/2})$ controllable, $(C, A)$ observable. Then $\overline{\gamma}^{sb} = 0$, i.e., the sequence $\{P_t\}$ is stochastically
bounded,

$$\lim_{M \to \infty} \sup_{t \in \mathbb{T}_+} \mathbb{P}^{\overline{\gamma}, P_0}\left( \|P_t\| > M \right) = 0 \tag{40}$$

for every $\overline{\gamma} > 0$ and initial covariance state $P_0$.

The proof is provided in Appendix B and offers a new insight into the random Riccati equation. For a discussion 
on the consequence and significance of the result, the reader is referred to the text following Proposition 8 in [2]. 
We reemphasize here, that the result establishes the importance of stochastic boundedness as a metric for system 
stability and design as compared to various notions of moment stability. For example, as shown in [1], the critical 
probability for mean stability may be quite large, depending on the instability of A. However, we show that, even in 
the sub-mean stability regime, the system is stochastically bounded (for $\overline{\gamma} > 0$) and converges to a unique invariant
distribution. Hence, our analysis offers insight into system design in the sub-mean stability regime. 

The following result on limits of real number sequences will be useful later. 

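Proposition 3.10 For $J \in \mathbb{N}$ and $1 \leq i \leq J$, let $a_i : [0, 1) \longmapsto [0, 1]$ be functions with

$$\lim_{\overline{\gamma} \uparrow 1} \frac{\ln(a_i(\overline{\gamma}))}{\ln(1 - \overline{\gamma})} = \overline{a}_i, \quad 1 \leq i \leq J \tag{41}$$

(we adopt the convention $\ln 0 = -\infty$, and $\overline{a}_i$ is non-negative with $\infty$ as a possible value). Then

$$\lim_{\overline{\gamma} \uparrow 1} \frac{\ln\left( \sum_{i=1}^{J} a_i(\overline{\gamma}) \right)}{\ln(1 - \overline{\gamma})} = \min_{i \in \{1, \cdots, J\}} \{ \overline{a}_i \} \tag{42}$$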
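
A quick numerical check illustrates the limit in Proposition 3.10 for power-law terms. The sketch below is only illustrative: the two functions $a_i$, their exponents and constant factors, and the helper `log_ratio` are hypothetical, and the computation is parametrized by `delta`, standing for $1 - \overline{\gamma}$, to avoid floating-point rounding of $\overline{\gamma}$ near 1.

```python
import math

def log_ratio(delta, exponents_and_scales):
    """ln(sum_i a_i) / ln(delta), with a_i = c_i * delta**e_i and delta = 1 - gamma_bar."""
    total = sum(c * delta ** e for e, c in exponents_and_scales)
    return math.log(total) / math.log(delta)

# Hypothetical terms: exponents 1.5 and 0.5, with constant factors in (0, 1].
terms = [(1.5, 1.0), (0.5, 0.5)]
for delta in (1e-2, 1e-8, 1e-50, 1e-200):
    print(delta, log_ratio(delta, terms))  # approaches the smallest exponent, 0.5
```

The convergence is slow (logarithmic in `delta`), because the constant factor contributes a correction of order $1 / |\ln(1 - \overline{\gamma})|$; the sum is nevertheless dominated by the term with the smallest exponent.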

4. Main Results and Discussions 

We state the main results of the paper in this section whose proofs are provided in Section 7. 

The following result is a first step to understanding the behavior of the family $\{\mu^{\overline{\gamma}}\}$ of invariant distributions.

Theorem 4.1 The family of invariant distributions $\{\mu^{\overline{\gamma}}\}$ converges weakly to the Dirac probability measure $\delta_{P^*}$ as
$\overline{\gamma} \uparrow 1$, i.e.,

$$\lim_{\overline{\gamma} \uparrow 1} d_P\left( \mu^{\overline{\gamma}}, \delta_{P^*} \right) = 0 \tag{43}$$

We have the following convergence rate asymptotics:
For every $\varepsilon > 0$, we have

$$\limsup_{\overline{\gamma} \uparrow 1} -\frac{\ln\left( \mu^{\overline{\gamma}}\left( B^C_\varepsilon(P^*) \right) \right)}{\ln(1 - \overline{\gamma})} \leq -1 \tag{44}$$

We discuss the consequences of Theorem 4.1. The first assertion states that the family $\{\mu^{\overline{\gamma}}\}$ converges weakly
to the Dirac measure concentrated at $P^*$, $\delta_{P^*}$, as $\overline{\gamma} \uparrow 1$. This is quite intuitive, as with $\overline{\gamma} \uparrow 1$, the filtering problem
reduces to the classical Kalman filtering setup with deterministic packet arrival (no dropouts), in which case the
deterministic sequence of conditional error covariances converges to the fixed point $P^*$. Thus, as $\overline{\gamma} \uparrow 1$, we expect
the RRE sequence to behave more and more similarly to the deterministic $\overline{\gamma} = 1$ case, leading to the convergence
of the measures $\mu^{\overline{\gamma}}$ to $\delta_{P^*}$ as $\overline{\gamma} \uparrow 1$. However, from a technical point of view this is not obvious, as the case $\overline{\gamma} = 1$
may be a singularity. Theorem 4.1 rules out this possibility and shows that the family $\{\mu^{\overline{\gamma}}\}$ viewed as a function
of $\overline{\gamma}$ is sufficiently well behaved (continuous) at $\overline{\gamma} = 1$.

An immediate consequence of Theorem 4.1 is the following:

\[ \lim_{\gamma \uparrow 1} \mu^{\gamma}(\Gamma) = 0, \quad \forall\, \Gamma \text{ such that } \overline{\Gamma} \cap \{P^*\} = \emptyset \tag{45} \]

Thus, w.r.t. $\{\mu^{\gamma}\}$, every event $\Gamma$ with $P^* \notin \overline{\Gamma}$ is a rare event (see Defn. 1.2). This is intuitively clear: as $\gamma \uparrow 1$, the measures $\mu^{\gamma}$ become concentrated on arbitrarily small neighborhoods of $P^*$, making such an event $\Gamma$ very difficult to observe.

Once the rare events are identified, the next step in the characterization of $\{\mu^{\gamma}\}$ is to ascertain the rate at which the probabilities of such rare events go to zero, or the rate at which the family $\{\mu^{\gamma}\}$ converges to $\delta_{P^*}$ as $\gamma \uparrow 1$. The distance between the family $\{\mu^{\gamma}\}$ and its weak limit $\delta_{P^*}$ is important for the design engineer, as it quantifies the loss in performance when operating at packet arrival probability $\gamma < 1$. The answer is provided in the second assertion of Theorem 4.1, which says that, for every $\varepsilon > 0$, the probability of staying away from the $\varepsilon$-neighborhood of $P^*$ decreases at least as fast as $(1-\gamma)$ when $\gamma \uparrow 1$, i.e.,$^{6}$

\[ \mu^{\gamma}\left(B_{\varepsilon}^{C}(P^*)\right) = O(1-\gamma), \quad \forall \varepsilon > 0 \tag{46} \]

Since the above holds for every $\varepsilon > 0$, the probability of any rare event vanishes at least as fast as $(1-\gamma)$. Clearly, the exact rate of decay depends on the rare event in question (for example, how 'far' it is from $P^*$, which becomes 'typical' as $\gamma \uparrow 1$).

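A complete characterization of the family $\{\mu^{\gamma}\}$ requires the exact decay rate of rare events as $\gamma \uparrow 1$. This is achieved in the following result, which establishes an MDP for the family $\{\mu^{\gamma}\}$ as $\gamma \uparrow 1$.

Theorem 4.2 Recall the definition of $\pi(\cdot)$ in equation (37). The family of invariant distributions $\{\mu^{\gamma}\}$ satisfies an MDP at scale $-\ln(1-\gamma)$ as $\gamma \uparrow 1$ with a good rate function $I(\cdot)$, i.e.,

\[ \liminf_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\mathcal{O}) \geq -\inf_{X \in \mathcal{O}} I(X), \quad \text{for every open set } \mathcal{O} \tag{47} \]

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\mathcal{F}) \leq -\inf_{X \in \mathcal{F}} I(X), \quad \text{for every closed set } \mathcal{F} \tag{48} \]

where the function $I : \mathbb{S}_+^N \to \overline{\mathbb{R}}_+$ is given by

\[ I(X) = \inf_{\mathcal{R} \in \mathcal{S}^{P^*}(X)} \pi(\mathcal{R}), \quad \forall X \in \mathbb{S}_+^N \tag{49} \]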

Theorem 4.2 provides a complete understanding of the family of invariant measures $\{\mu^{\gamma}\}$ as $\gamma \uparrow 1$. First, it establishes the important qualitative behavior of $\{\mu^{\gamma}\}$, namely, that rare events decay exactly as power laws of $(1-\gamma)$ as $\gamma \uparrow 1$. Second, the exact exponent of such a power law decay depends on the particular rare event and is obtained as the solution of an associated variational problem involving the minimization of the rate function $I(\cdot)$. This is relevant for a system designer, who can trade off estimation accuracy against the communication required. For example, given a tolerance $M > 0$, we may ask: at what operating $\gamma$ is the probability of lying outside the $M$-neighborhood centered at $P^*$ less than some $\delta > 0$? Using the notation of (12), we then have

\[ \mu^{\gamma}\left(B_{M}^{C}(P^*)\right) \sim (1-\gamma)^{\inf_{X \in B_{M}^{C}(P^*)} I(X)} \tag{50} \]

Thus, by computing $\inf_{X \in B_{M}^{C}(P^*)} I(X)$, the designer obtains an estimate of the $\gamma$ required to maintain a probability of error less than $\delta$. The estimation of probabilities of rare events therefore reduces to solving deterministic variational problems. As shown in Section 8, several techniques can be employed to solve these variational problems efficiently.

6 For functions $h(\cdot)$, $g(\cdot)$, the notation $h(\gamma) = O(g(\gamma))$ implies the existence of a constant $c > 0$ such that $h(\gamma) \leq c\, g(\gamma)$ for all $\gamma \in (0, 1)$.
Finally, we emphasize that reducing the estimation of the probabilities of interest to variational problems that can be solved efficiently is much more definitive and relevant than numerically estimating the invariant distributions. A naive numerical approach of simulating the distributions $\{\mu^{\gamma}\}$ as $\gamma \uparrow 1$ becomes meaningless here, because the rare events of interest become increasingly difficult to simulate as $\gamma \uparrow 1$. One may resort to sophisticated simulation techniques such as importance sampling (see, for example, [14]), but such approaches require a characterization of the distributions in question, which is what this paper provides.
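The design question posed around (50) can be sketched in a few lines. The snippet below is a minimal illustration, not part of the paper: it assumes the decay exponent $\inf I$ has already been computed, ignores multiplicative constants in the power law (so it is a leading-order estimate only), and the function name `required_gamma` is hypothetical.

```python
def required_gamma(delta: float, exponent: float) -> float:
    """Smallest gamma for which (1 - gamma) ** exponent <= delta.

    Leading-order estimate only: constants in front of the power law
    (1 - gamma) ** exponent are ignored, which is an assumption of
    this sketch and not a statement from the paper.
    """
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if exponent <= 0.0:
        raise ValueError("exponent must be positive")
    # (1 - gamma) ** exponent <= delta  <=>  gamma >= 1 - delta ** (1 / exponent)
    return 1.0 - delta ** (1.0 / exponent)
```

For instance, `required_gamma(1e-4, 4.0)` gives the arrival probability at which a rare event with decay exponent 4 has estimated probability $10^{-4}$; a larger exponent permits a smaller $\gamma$ for the same target.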

5. Moderate Deviations (MD) for finite-dimensional distributions 

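In this section, we establish MD for finite-dimensional distributions of the process $\{P_t\}_{t \in \mathbb{T}_+}$ as $\gamma \uparrow 1$. We start by setting notation. Fix $n \in \mathbb{N}$ and $t_1 < \cdots < t_n \in \mathbb{T}_+$, and recall the sets $\mathcal{S}^{P_0}_{t_1,\dots,t_n}$ and $\mathcal{S}^{P_0}_{t_1,\dots,t_n}(X)$ for $X \in \otimes_{i=1}^{n} \mathbb{S}_+^N$. As noted in Proposition 3.6, the random object $(P_{t_1}, \dots, P_{t_n})$ takes only a finite number of values and, for all $\gamma$,

\[ \mathbb{P}^{\gamma, P_0}\left( (P_{t_1}, \dots, P_{t_n}) \in \mathcal{N}\left(\mathcal{S}^{P_0}_{t_1,\dots,t_n}\right) \right) = 1 \tag{51} \]

We generalize the definition of the functional $\pi : \mathcal{S}^{P_0}_{t} \to \mathbb{Z}_+$ (eqn. 37) to strings in $\mathcal{S}^{P_0}_{t_1,\dots,t_n}$ by

\[ \pi(\mathcal{R}) = \begin{cases} \sum_{k=1}^{t} \mathbb{I}_{\{0\}}(i_k) & \text{if } t \geq 1 \\ 0 & \text{otherwise} \end{cases} \tag{52} \]

Thus $\pi(\cdot)$ counts the number of $f_0$'s in the string $\mathcal{R}$. Also, for $X \in \otimes_{i=1}^{n} \mathbb{S}_+^N$, define

\[ \ell(X) = \min_{\mathcal{R} \in \mathcal{S}^{P_0}_{t_1,\dots,t_n}(X)} \pi(\mathcal{R}) \tag{53} \]

(We adopt the convention that the minimum of an empty set is $\infty$.)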
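As a small illustration of (52) and (53), the sketch below represents a string only by its index sequence $(i_1, \dots, i_t)$ (an encoding chosen for this example, not taken from the paper) and counts the $f_0$'s. The names `pi` and `ell` mirror the symbols above; the set of strings passed to `ell` is assumed to have been enumerated elsewhere.

```python
import math
from typing import Iterable, Sequence


def pi(indices: Sequence[int]) -> int:
    """Number of f_0's in a string, given its indices i_1, ..., i_t in {0, 1}."""
    return sum(1 for i in indices if i == 0)


def ell(strings: Iterable[Sequence[int]]) -> float:
    """Minimum of pi over a collection of strings.

    Follows the stated convention that the minimum over an empty
    collection is infinity.
    """
    return min((pi(s) for s in strings), default=math.inf)
```

The empty string (length zero) has `pi` equal to 0, matching the "otherwise" branch of (52).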

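The following result shows that the function $\ell$ is a good rate function.

Proposition 5.1 The function $\ell : \otimes_{i=1}^{n} \mathbb{S}_+^N \to \overline{\mathbb{R}}_+$ in (53) is a good rate function on $\otimes_{i=1}^{n} \mathbb{S}_+^N$.

Proof: Clearly, $\ell(\cdot) \geq 0$. Its level sets are compact because its effective domain $\mathcal{D}_{\ell}$ is finite, where

\[ \mathcal{D}_{\ell} = \left\{ X \in \otimes_{i=1}^{n} \mathbb{S}_+^N \;\middle|\; \ell(X) < \infty \right\} \tag{54} \]

■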

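We then have the following result giving the MD for the family (as $\gamma \uparrow 1$) of finite-dimensional distributions $(P_{t_1}, \dots, P_{t_n})$. The proof is provided in Appendix C.

Theorem 5.2 Fix $n \in \mathbb{N}$ and $t_1 < \cdots < t_n \in \mathbb{T}_+$, and let $(P_{t_1}, \dots, P_{t_n})^{\gamma}$ be the family of finite-dimensional distributions indexed by $\gamma$, starting from the same initial state $P_0$. Then, for every $B \in \mathcal{B}\left(\otimes_{i=1}^{n} \mathbb{S}_+^N\right)$, we have

\[ \lim_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mathbb{P}^{\gamma, P_0}\left( (P_{t_1}, \dots, P_{t_n}) \in B \right) = -\inf_{X \in B} I^{P_0}_{t_1,\dots,t_n}(X) \tag{55} \]

where

\[ I^{P_0}_{t_1,\dots,t_n}(X) = \begin{cases} \ell(X) & \text{if } X \in \mathcal{N}\left(\mathcal{S}^{P_0}_{t_1,\dots,t_n}\right) \\ \infty & \text{otherwise} \end{cases} \tag{56} \]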

Remark 5.3 The MDP for finite-dimensional distributions gives insight into transient system behavior. More importantly, it offers considerable insight in guessing the proper rate function governing the MDP for the family of invariant distributions. A naive approach to the MDP rate function for the family of invariant distributions is to view it as a suitable 'limit' of the fixed-time rate functions $I_t(\cdot)$, provided the latter converge in an appropriate sense to a limit $I(\cdot)$ as $t \to \infty$. The intuition is the weak convergence of the random sequence $\{P_t\}_{t \in \mathbb{T}_+}$ as $t \to \infty$. However, in general, the limit of fixed-time rate functions may not be the rate function governing the MDP for the family of invariant distributions. One needs to verify rigorously that the rate function guessed in this way is the actual rate function governing the required MDP. This is precisely the way (at least implicitly) we establish the MDP rate function of the family of invariant distributions in Section 6.

6. MDP FOR INVARIANT DISTRIBUTIONS 

This section constitutes the key technical part of the paper. Apart from establishing the main ingredients for proving the MDP results in Section 4, the results are of independent interest and form a basis for understanding the characteristics of stationary measures resulting from stable iterated function systems in general. As suggested in Remark 5.3, an intuitive guess for the MDP rate function of the family $\{\mu^{\gamma}\}$ as $\gamma \uparrow 1$ is $I(\cdot)$. However, a priori, it is not even obvious whether $I(\cdot)$ is lower semicontinuous, as required of a rate function. Hence, we start by defining the lower semicontinuous regularization $I_L$ of $I$ and establish some of its properties in Subsection 6-A. At this point, it is not clear whether $I = I_L$ (i.e., whether $I$ is lower semicontinuous) and, hence, we set out to establish the MDP for the family $\{\mu^{\gamma}\}$ with $I_L$ as the candidate rate function. The major technical lemmas of this section are presented in Subsections 6-B and 6-C, where we establish the MDP lower and upper bounds, respectively, w.r.t. the proposed rate function $I_L$. Our approach is fairly general and is not restricted to the particular filtering problem considered here.



A. A rate function 

Recall the function $I : \mathbb{S}_+^N \to \overline{\mathbb{R}}_+$ defined by

\[ I(X) = \inf_{\mathcal{R} \in \mathcal{S}^{P^*}(X)} \pi(\mathcal{R}), \quad \forall X \in \mathbb{S}_+^N \tag{57} \]

where, as usual, we adopt the convention that the infimum of an empty set is $\infty$. Note that, for $X \in \mathbb{S}_+^N$,

\[ I(X) = \inf_{t \in \mathbb{T}_+} I_t^{P^*}(X) \tag{58} \]

so $I(\cdot)$ can be thought of as a natural generalization of the marginal rate functions $I_t^{P^*}(\cdot)$ for all $t$.

However, it is not clear a priori that the function $I(\cdot)$ is lower semicontinuous (this question is settled in Section 7), and hence it does not immediately qualify as a rate function. A candidate rate function for the family of invariant distributions is the lower semicontinuous regularization of $I(\cdot)$, defined as

\[ I_L(X) = \lim_{\varepsilon \to 0} \inf_{Y \in B_{\varepsilon}(X)} I(Y), \quad \forall X \in \mathbb{S}_+^N \tag{59} \]

(Note that if $I(\cdot)$ is lower semicontinuous, then $I(\cdot) = I_L(\cdot)$.)

The following proposition states some easily verifiable properties of $I_L : \mathbb{S}_+^N \to \overline{\mathbb{R}}_+$; its proof is provided in Appendix D.

Proposition 6.1 (i) The function $I_L(\cdot)$ is a good rate function on $\mathbb{S}_+^N$.

(ii) For every $X \in \mathbb{S}_+^N$, we have

\[ I_L(X) = \lim_{\varepsilon \to 0} \inf_{Y \in \overline{B}_{\varepsilon}(X)} I(Y) \tag{60} \]

(iii) For any non-empty set $\Gamma \in \mathcal{B}(\mathbb{S}_+^N)$, we have

\[ \inf_{X \in \Gamma} I_L(X) \leq \inf_{X \in \Gamma} I(X) \tag{61} \]

In addition, if $\Gamma$ is open, the reverse inequality holds and we have

\[ \inf_{X \in \Gamma} I_L(X) = \inf_{X \in \Gamma} I(X) \tag{62} \]

(iv) Let $K \subset \mathbb{S}_+^N$ be a non-empty compact set. We have

\[ \lim_{\varepsilon \to 0} \inf_{Y \in \overline{K_{\varepsilon}}} I_L(Y) = \inf_{Y \in K} I_L(Y) \tag{63} \]

B. The MDP lower bound 

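The following result establishes the MDP lower bound for the family $\{\mu^{\gamma}\}$ of invariant distributions as $\gamma \uparrow 1$.

Lemma 6.2 Let $\Gamma \in \mathcal{B}(\mathbb{S}_+^N)$. Then the following lower bound holds:

\[ \liminf_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\Gamma) \geq -\inf_{X \in \Gamma^{\circ}} I_L(X) \tag{64} \]

Proof: Let $P_0$ be an arbitrary initial state and $\{P_t^{\gamma, P_0}\}_{t \in \mathbb{T}_+}$ be the sequence generated by the RRE for $\gamma \in (0, 1)$. It was shown in Theorem 9 in [2] that, for such $\gamma$, the sequence $\{P_t^{\gamma, P_0}\}_{t \in \mathbb{T}_+}$ converges weakly to an invariant distribution $\mu^{\gamma}$, i.e.,

\[ \liminf_{t \to \infty} \mathbb{P}^{\gamma, P_0}\left(P_t^{\gamma, P_0} \in \mathcal{O}\right) \geq \mu^{\gamma}(\mathcal{O}), \quad \forall \text{ open set } \mathcal{O} \subset \mathbb{S}_+^N \tag{65} \]

\[ \limsup_{t \to \infty} \mathbb{P}^{\gamma, P_0}\left(P_t^{\gamma, P_0} \in \mathcal{F}\right) \leq \mu^{\gamma}(\mathcal{F}), \quad \forall \text{ closed set } \mathcal{F} \subset \mathbb{S}_+^N \tag{66} \]

Now consider the measurable set $\Gamma \in \mathcal{B}(\mathbb{S}_+^N)$ and let $X \in \Gamma^{\circ} \cap \mathcal{D}_I$, where $\mathcal{D}_I$ is the effective domain of $I(\cdot)$. Then there exists $\varepsilon > 0$, sufficiently small, such that the closed ball $\overline{B}_{\varepsilon}(X) \subset \Gamma$. From (66) it then follows that

\[ \mu^{\gamma}(\Gamma) \geq \mu^{\gamma}\left(\overline{B}_{\varepsilon}(X)\right) \geq \limsup_{t \to \infty} \mathbb{P}^{\gamma, P_0}\left(P_t^{\gamma, P_0} \in \overline{B}_{\varepsilon}(X)\right) \tag{67} \]

We now set out to estimate the R.H.S. of (67).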

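To this end, recall the nonempty set $\mathcal{S}^{P^*}(X)$ of all strings of finite (but arbitrary) length with initial state $P^*$ and numerical value $X$. For some $t_0 \in \mathbb{T}_+$ and $i_1, \dots, i_{t_0} \in \{0, 1\}$, let the string $\mathcal{R} = f_{i_1} \circ \cdots \circ f_{i_{t_0}}(P^*) \in \mathcal{S}^{P^*}(X)$. Define the function $g : \mathbb{S}_+^N \to \mathbb{S}_+^N$ by

\[ g(Y) = f_{i_1} \circ \cdots \circ f_{i_{t_0}}(Y), \quad \forall Y \in \mathbb{S}_+^N \tag{68} \]

The function $g(\cdot)$ is continuous (being a composition of continuous functions) and hence there exists $\varepsilon_1 > 0$ such that

\[ \|g(Y) - g(P^*)\| < \varepsilon, \quad \forall Y \in B_{\varepsilon_1}(P^*) \tag{69} \]

Also, by Lemma 3.7, there exists $t_{\varepsilon_1} \in \mathbb{T}_+$ such that

\[ \|f_1^{t}(Y) - P^*\| < \varepsilon_1, \quad \forall t \geq t_{\varepsilon_1},\ Y \in \mathbb{S}_+^N \tag{70} \]

It then follows from eqns. (69,70) that, for any $t \in \mathbb{T}_+$ such that $t \geq t_0 + t_{\varepsilon_1}$ and any string $\mathcal{R}_1 \in \mathcal{S}_t^{P_0}$ of the form

\[ \mathcal{R}_1 = \left(f_{i_1}, \dots, f_{i_{t_0}}, f_1^{t_{\varepsilon_1}}, f_{j_1}, \dots, f_{j_{t - t_0 - t_{\varepsilon_1}}}, P_0\right) \tag{71} \]

where $j_1, \dots, j_{t - t_0 - t_{\varepsilon_1}} \in \{0, 1\}$, we have

\[ \|\mathcal{N}(\mathcal{R}_1) - X\| < \varepsilon \tag{72} \]

Indeed,

\[ \|\mathcal{N}(\mathcal{R}_1) - X\| = \left\| g\left(f_1^{t_{\varepsilon_1}}\left(f_{j_1} \circ \cdots \circ f_{j_{t - t_0 - t_{\varepsilon_1}}}(P_0)\right)\right) - g(P^*) \right\| < \varepsilon \tag{73} \]

where the last step follows from the fact that

\[ \left\| f_1^{t_{\varepsilon_1}}\left(f_{j_1} \circ \cdots \circ f_{j_{t - t_0 - t_{\varepsilon_1}}}(P_0)\right) - P^* \right\| < \varepsilon_1 \tag{74} \]

by (70).

Now for $\mathcal{R}$ (defined above) and $t \geq t_0 + t_{\varepsilon_1}$, consider the set of strings

\[ \mathfrak{R}_t = \left\{ \left(f_{i_1}, \dots, f_{i_{t_0}}, f_1^{t_{\varepsilon_1}}, f_{j_1}, \dots, f_{j_{t - t_0 - t_{\varepsilon_1}}}, P_0\right) \;\middle|\; j_1, \dots, j_{t - t_0 - t_{\varepsilon_1}} \in \{0, 1\} \right\} \tag{75} \]

(in other words, the indices $j_1, \dots, j_{t - t_0 - t_{\varepsilon_1}}$ can be arbitrary). It then follows from (72) above that

\[ \mathcal{N}(\mathcal{R}_2) \in B_{\varepsilon}(X), \quad \forall \mathcal{R}_2 \in \mathfrak{R}_t \tag{76} \]

From the iterative construction of the sequence $\{P_t^{\gamma, P_0}\}_{t \in \mathbb{T}_+}$ it is then obvious that, for $t \geq t_0 + t_{\varepsilon_1}$,

\[ \begin{aligned} \mathbb{P}^{\gamma, P_0}\left(P_t^{\gamma, P_0} \in \overline{B}_{\varepsilon}(X)\right) &\geq \sum_{j_1, \dots, j_{t - t_0 - t_{\varepsilon_1}} \in \{0,1\}} (1-\gamma)^{\pi(\mathcal{R})} \gamma^{t_0 - \pi(\mathcal{R})} \gamma^{t_{\varepsilon_1}} \prod_{k=1}^{t - t_0 - t_{\varepsilon_1}} (1-\gamma)^{1 - j_k} \gamma^{j_k} \\ &= (1-\gamma)^{\pi(\mathcal{R})} \gamma^{t_0 + t_{\varepsilon_1} - \pi(\mathcal{R})} \sum_{j_1, \dots, j_{t - t_0 - t_{\varepsilon_1}} \in \{0,1\}} \prod_{k=1}^{t - t_0 - t_{\varepsilon_1}} (1-\gamma)^{1 - j_k} \gamma^{j_k} \\ &= (1-\gamma)^{\pi(\mathcal{R})} \gamma^{t_0 + t_{\varepsilon_1} - \pi(\mathcal{R})} \end{aligned} \tag{77} \]

From eqns. (67,77) we have

\[ \mu^{\gamma}(\Gamma) \geq (1-\gamma)^{\pi(\mathcal{R})} \gamma^{t_0 + t_{\varepsilon_1} - \pi(\mathcal{R})} \tag{78} \]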

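Noting that $\ln(1-\gamma) < 0$ and passing to the limit in the above, we have

\[ \liminf_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\Gamma) \geq -\pi(\mathcal{R}) \tag{79} \]

Since the above holds for all $\mathcal{R} \in \mathcal{S}^{P^*}(X)$, we have

\[ \liminf_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\Gamma) \geq \sup_{\mathcal{R} \in \mathcal{S}^{P^*}(X)} \left(-\pi(\mathcal{R})\right) = -\inf_{\mathcal{R} \in \mathcal{S}^{P^*}(X)} \pi(\mathcal{R}) = -I(X) \tag{80} \]

The above holds for all $X \in \Gamma^{\circ} \cap \mathcal{D}_I$ and hence

\[ \liminf_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\Gamma) \geq -\inf_{X \in \Gamma^{\circ} \cap \mathcal{D}_I} I(X) = -\inf_{X \in \Gamma^{\circ}} I(X) \tag{81} \]

where the last step follows from the fact that $I(X) = \infty$ for $X \notin \mathcal{D}_I$. To establish the Lemma, it suffices to show that the R.H.S. of (81) satisfies

\[ -\inf_{X \in \Gamma^{\circ}} I(X) = -\inf_{X \in \Gamma^{\circ}} I_L(X) \tag{82} \]

which follows from Proposition 6.1 Assertion (iii), since $\Gamma^{\circ}$ is an open set.

■

We extract the following result for later use; it follows from the arguments in the proof of Lemma 6.2 culminating in (78).

Corollary 6.3 Let $\mathcal{O} \subset \mathbb{S}_+^N$ be an open set and $\mathcal{R} \in \mathcal{S}^{P^*}$ be a string such that $\mathcal{N}(\mathcal{R}) \in \mathcal{O}$. Then there exists a positive integer $t_{\mathcal{R}, \mathcal{O}}$ (depending on $\mathcal{R}$ and $\mathcal{O}$) such that

\[ \mu^{\gamma}(\mathcal{O}) \geq (1-\gamma)^{\pi(\mathcal{R})} \gamma^{t_{\mathcal{R}, \mathcal{O}} - \pi(\mathcal{R})}, \quad \forall \gamma \in [0, 1] \tag{83} \]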

C. The MDP upper bound 

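In this subsection, we establish the MDP upper bound for the family of invariant measures as $\gamma \uparrow 1$. The proof is carried out in essentially three stages. First, we establish the upper bound for compact sets in Lemma 6.7. Then, in Lemma 6.8, we prove a tightness result for the family of invariant distributions. Finally, we establish the MDP upper bound for closed sets in Lemma 6.9.

We start with a result on topological properties of strings, for which we need the following definition.

Definition 6.4 (Truncated String) Let the string $\mathcal{R}$ be given by

\[ \mathcal{R} = \left(f_{i_1}, \dots, f_{i_t}, P_0\right) \tag{84} \]

where $t \in \mathbb{T}_+$, $i_1, \dots, i_t \in \{0, 1\}$ and $P_0 \in \mathbb{S}_+^N$. Then, for $s \leq t$, the truncated string $\mathcal{R}^{s}$ of length $s$ is given by

\[ \mathcal{R}^{s} = \left(f_{i_1}, \dots, f_{i_s}, P_0\right) \tag{85} \]

Lemma 6.5 Let $\mathcal{F} \subset \mathbb{S}_+^N$ be a closed set. Define the set of strings $\mathcal{U}(\mathcal{F}) \subset \mathcal{S}^{P^*}$ by

\[ \mathcal{U}(\mathcal{F}) = \left\{ \mathcal{R} \in \mathcal{S}^{P^*} \;\middle|\; \mathcal{N}(\mathcal{R}) \in \mathcal{F} \right\} \tag{86} \]

and let

\[ \ell(\mathcal{F}) = \inf_{\mathcal{R} \in \mathcal{U}(\mathcal{F})} \pi(\mathcal{R}) \tag{87} \]

(we adopt the convention that the infimum of an empty set is $\infty$). Then, if $\ell(\mathcal{F}) < \infty$, there exists $t_{\mathcal{F}} \in \mathbb{T}_+$ sufficiently large such that, for all $\mathcal{R} \in \mathcal{U}(\mathcal{F})$ with $\mathrm{len}(\mathcal{R}) \geq t_{\mathcal{F}}$, we have

\[ \pi\left(\mathcal{R}^{t_{\mathcal{F}}}\right) \geq \ell(\mathcal{F}) \tag{88} \]

For a proof see Appendix D. We present the following remark.

Remark 6.6 (i) It follows from the definitions that

\[ \mathcal{U}(\mathcal{F}) = \bigcup_{X \in \mathcal{F}} \mathcal{S}^{P^*}(X) \tag{89} \]

and hence

\[ \ell(\mathcal{F}) = \inf_{X \in \mathcal{F}} \inf_{\mathcal{R} \in \mathcal{S}^{P^*}(X)} \pi(\mathcal{R}) = \inf_{X \in \mathcal{F}} I(X) \tag{90} \]

(ii) If $\ell(\mathcal{F}) < \infty$, i.e., the set $\mathcal{U}(\mathcal{F})$ is non-empty, the infimum in (87) is attained, i.e., there exists $\mathcal{R}^* \in \mathcal{U}(\mathcal{F})$ such that

\[ \pi(\mathcal{R}^*) = \ell(\mathcal{F}) \tag{91} \]

This follows from the fact that the function $\pi(\cdot)$ takes only non-negative integer values. Similarly, in the case $\ell(\mathcal{F}) < \infty$, if we define $X^* = \mathcal{N}(\mathcal{R}^*)$, then

\[ I(X^*) = \inf_{X \in \mathcal{F}} I(X) \tag{92} \]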

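We now prove the MDP upper bound for the family $\{\mu^{\gamma}\}$ as $\gamma \uparrow 1$ over compact sets.

Lemma 6.7 Let $K \in \mathcal{B}(\mathbb{S}_+^N)$ be a compact set. Then the following upper bound holds:

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(K) \leq -\inf_{X \in K} I_L(X) \tag{93} \]

Proof: For every $\varepsilon > 0$, define the $\varepsilon$-closure $\overline{K_{\varepsilon}}$ and the $\varepsilon$-neighborhood $K_{\varepsilon}$ of $K$ by

\[ \overline{K_{\varepsilon}} = \left\{ X \in \mathbb{S}_+^N \;\middle|\; \inf_{Y \in K} \|X - Y\| \leq \varepsilon \right\} \tag{94} \]

\[ K_{\varepsilon} = \left\{ X \in \mathbb{S}_+^N \;\middle|\; \inf_{Y \in K} \|X - Y\| < \varepsilon \right\} \tag{95} \]

Since $K_{\varepsilon}$ is open, we have by the weak convergence of the sequence $\{P_t^{\gamma, P^*}\}_{t \in \mathbb{T}_+}$ to $\mu^{\gamma}$

\[ \liminf_{t \to \infty} \mathbb{P}^{\gamma, P^*}\left(P_t^{\gamma, P^*} \in K_{\varepsilon}\right) \geq \mu^{\gamma}(K_{\varepsilon}) \tag{96} \]

which in turn implies

\[ \liminf_{t \to \infty} \mathbb{P}^{\gamma, P^*}\left(P_t^{\gamma, P^*} \in \overline{K_{\varepsilon}}\right) \geq \mu^{\gamma}(K) \tag{97} \]

We now estimate the probabilities on the L.H.S. of (97). Since $\overline{K_{\varepsilon}}$ is closed, the results of Lemma 6.5 apply. Recall the objects $\mathcal{U}(\cdot)$ and $\ell(\cdot)$ defined for any closed set $\mathcal{F}$ as:

\[ \mathcal{U}(\mathcal{F}) = \left\{ \mathcal{R} \in \mathcal{S}^{P^*} \;\middle|\; \mathcal{N}(\mathcal{R}) \in \mathcal{F} \right\} \tag{98} \]

\[ \ell(\mathcal{F}) = \inf_{\mathcal{R} \in \mathcal{U}(\mathcal{F})} \pi(\mathcal{R}) \tag{99} \]

with the convention that $\ell(\mathcal{F}) = \infty$ if $\mathcal{U}(\mathcal{F})$ is empty. Also, for every $t \in \mathbb{T}_+$ and closed set $\mathcal{F}$, define the sets:

\[ \mathcal{U}_t(\mathcal{F}) = \mathcal{U}(\mathcal{F}) \cap \mathcal{S}_t^{P^*} \tag{100} \]

To establish the Lemma we consider two cases, according to whether $\ell(K) < \infty$ (i.e., $\mathcal{U}(K)$ is non-empty) or not. We first consider the non-trivial case $\ell(K) < \infty$.

To this end, fix $\varepsilon > 0$. From Proposition 3.6 Assertion (ii) it is easy to see that

\[ \mathbb{P}^{\gamma, P^*}\left(P_t^{\gamma, P^*} \in \overline{K_{\varepsilon}}\right) = \mathbb{P}^{\gamma, P^*}\left(P_t^{\gamma, P^*} \in \mathcal{N}\left(\mathcal{U}_t(\overline{K_{\varepsilon}})\right)\right) \tag{101} \]

Since $K \subset \overline{K_{\varepsilon}}$ and $\ell(K) < \infty$, we have $\ell(\overline{K_{\varepsilon}}) < \infty$. The fact that $\overline{K_{\varepsilon}}$ is closed and Lemma 6.5 imply that there exists $t_{\overline{K_{\varepsilon}}} \in \mathbb{T}_+$ such that, for every string $\mathcal{R} \in \mathcal{U}(\overline{K_{\varepsilon}})$ with $\mathrm{len}(\mathcal{R}) \geq t_{\overline{K_{\varepsilon}}}$, we have $\pi\left(\mathcal{R}^{t_{\overline{K_{\varepsilon}}}}\right) \geq \ell(\overline{K_{\varepsilon}})$. In other words, we have for all $t \geq t_{\overline{K_{\varepsilon}}}$,

\[ \pi\left(\mathcal{R}^{t_{\overline{K_{\varepsilon}}}}\right) \geq \ell(\overline{K_{\varepsilon}}), \quad \forall \mathcal{R} \in \mathcal{U}_t(\overline{K_{\varepsilon}}) \tag{102} \]

Now consider $t \geq t_{\overline{K_{\varepsilon}}}$ and define the set of strings $\mathcal{J}_t^{P^*}$ by

\[ \mathcal{J}_t^{P^*} = \left\{ \mathcal{R} \in \mathcal{S}_t^{P^*} \;\middle|\; \pi\left(\mathcal{R}^{t_{\overline{K_{\varepsilon}}}}\right) \geq \ell(\overline{K_{\varepsilon}}) \right\} \tag{103} \]

In other words, the set $\mathcal{J}_t^{P^*}$ consists of all strings $\mathcal{R}$ of length $t$ such that there are at least $\ell(\overline{K_{\varepsilon}})$ occurrences of $f_0$ in the truncated string $\mathcal{R}^{t_{\overline{K_{\varepsilon}}}}$. The following inclusion is then obvious for $t \geq t_{\overline{K_{\varepsilon}}}$:

\[ \mathcal{U}_t(\overline{K_{\varepsilon}}) \subset \mathcal{J}_t^{P^*} \subset \mathcal{S}_t^{P^*} \tag{104} \]

By the Markovian dynamics of the RRE, it is clear that, for $t \geq t_{\overline{K_{\varepsilon}}}$,

\[ \mathbb{P}^{\gamma, P^*}\left(P_t^{\gamma, P^*} \in \mathcal{N}\left(\mathcal{J}_t^{P^*}\right)\right) = \sum_{\mathcal{R} \in \mathcal{J}_t^{P^*}} (1-\gamma)^{\pi(\mathcal{R})} \gamma^{t - \pi(\mathcal{R})} \leq \binom{t_{\overline{K_{\varepsilon}}}}{\ell(\overline{K_{\varepsilon}})} (1-\gamma)^{\ell(\overline{K_{\varepsilon}})} \tag{105} \]

From eqns. (97), (101), (104) and (105) we then have

\[ \mu^{\gamma}(K) \leq \liminf_{t \to \infty} \mathbb{P}^{\gamma, P^*}\left(P_t^{\gamma, P^*} \in \mathcal{N}\left(\mathcal{U}_t(\overline{K_{\varepsilon}})\right)\right) \leq \liminf_{t \to \infty} \mathbb{P}^{\gamma, P^*}\left(P_t^{\gamma, P^*} \in \mathcal{N}\left(\mathcal{J}_t^{P^*}\right)\right) \leq \binom{t_{\overline{K_{\varepsilon}}}}{\ell(\overline{K_{\varepsilon}})} (1-\gamma)^{\ell(\overline{K_{\varepsilon}})} \tag{106} \]

Taking the log-limits on both sides, and noting that $\ln(1-\gamma)$ is negative and $t_{\overline{K_{\varepsilon}}}$ is independent of $\gamma$, we have

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(K) \leq -\ln \binom{t_{\overline{K_{\varepsilon}}}}{\ell(\overline{K_{\varepsilon}})} \lim_{\gamma \uparrow 1} \frac{1}{\ln(1-\gamma)} - \ell(\overline{K_{\varepsilon}}) = -\ell(\overline{K_{\varepsilon}}) \tag{107} \]

Taking the limit on both sides as $\varepsilon \to 0$, we have

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(K) \leq -\lim_{\varepsilon \to 0} \ell(\overline{K_{\varepsilon}}) \tag{108} \]

From Proposition 6.1 (Assertion (iii)) we have, for all $\varepsilon > 0$,

\[ \ell(\overline{K_{\varepsilon}}) = \inf_{Y \in \overline{K_{\varepsilon}}} I(Y) \geq \inf_{Y \in \overline{K_{\varepsilon}}} I_L(Y) \tag{109} \]

Taking the limit and using Proposition 6.1 (Assertion (iv)), we have

\[ \lim_{\varepsilon \to 0} \ell(\overline{K_{\varepsilon}}) \geq \lim_{\varepsilon \to 0} \inf_{Y \in \overline{K_{\varepsilon}}} I_L(Y) = \inf_{Y \in K} I_L(Y) \tag{110} \]

The Lemma then follows from eqns. (108,110). ■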
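The inequality in (105) is a union bound: under i.i.d. Bernoulli arrivals, the probability of seeing at least $\ell$ drops ($f_0$'s) in $t$ fixed slots is at most $\binom{t}{\ell}(1-\gamma)^{\ell}$. The following sketch checks this numerically; the function names are hypothetical and the snippet is only an illustration of the bound, not part of the proof.

```python
from math import comb


def prob_at_least_k_drops(t: int, k: int, gamma: float) -> float:
    """Exact probability of at least k drops in t i.i.d. slots.

    Each slot is a drop (f_0) with probability 1 - gamma and an
    arrival (f_1) with probability gamma.
    """
    return sum(
        comb(t, j) * (1.0 - gamma) ** j * gamma ** (t - j)
        for j in range(k, t + 1)
    )


def union_bound(t: int, k: int, gamma: float) -> float:
    """Upper bound comb(t, k) * (1 - gamma) ** k of the form used in (105)."""
    return comb(t, k) * (1.0 - gamma) ** k
```

Comparing the two functions for fixed `t` and `k` as `gamma` approaches 1 shows that both decay like $(1-\gamma)^{k}$, which is all that the log-limit in (107) uses.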
The following tightness result enables us to extend the upper bound from compact sets to arbitrary closed sets (see Appendix D for a proof).

Lemma 6.8 The family of invariant distributions $\{\mu^{\gamma}\}$ satisfies the following tightness property: for every $a > 0$, there exists a compact set $K_a \subset \mathbb{S}_+^N$ such that

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}\left(K_a^{C}\right) \leq -a \tag{111} \]

We now complete the proof of the MDP upper bound for arbitrary closed sets, using the upper bound for compact sets and the tightness estimate obtained in Lemma 6.8. This parallels the theory of large deviations, where one first establishes the LDP upper bound for compact sets. The analogue of Lemma 6.8 in the LDP context is called exponential tightness. It can be shown that the LDP upper bound for closed sets follows from that for compact sets together with exponential tightness; see, for example, [15]. Although we are concerned here with an MDP, the proof philosophy is related: we first establish the MDP upper bound for compact sets and then use the tightness estimate of Lemma 6.8 to extend it to arbitrary closed sets.

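Lemma 6.9 Let $\mathcal{F} \in \mathcal{B}(\mathbb{S}_+^N)$ be a closed set. Then the following upper bound holds:

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\mathcal{F}) \leq -\inf_{X \in \mathcal{F}} I_L(X) \tag{112} \]

Proof: Let $a > 0$ be an arbitrary positive number. By the tightness estimate in Lemma 6.8, there exists a compact set $K_a \subset \mathbb{S}_+^N$ such that

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}\left(K_a^{C}\right) \leq -a \tag{113} \]

The set $\mathcal{F} \cap K_a$ is compact, being the intersection of a closed and a compact set, and hence the MDP upper bound holds by Lemma 6.7, i.e., we have

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\mathcal{F} \cap K_a) \leq -\inf_{X \in \mathcal{F} \cap K_a} I_L(X) \tag{114} \]

To estimate the probabilities $\mu^{\gamma}(\mathcal{F})$, we use the decomposition:

\[ \mu^{\gamma}(\mathcal{F}) = \mu^{\gamma}(\mathcal{F} \cap K_a) + \mu^{\gamma}\left(\mathcal{F} \cap K_a^{C}\right) \leq \mu^{\gamma}(\mathcal{F} \cap K_a) + \mu^{\gamma}\left(K_a^{C}\right) \tag{115} \]

By Lemma 3.10 we then have

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\mathcal{F}) \leq \max\left( \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\mathcal{F} \cap K_a),\ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}\left(K_a^{C}\right) \right) \tag{116} \]

From eqns. (113,114) we then have

\[ \begin{aligned} \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\mathcal{F}) &\leq \max\left( -\inf_{X \in \mathcal{F} \cap K_a} I_L(X),\ -a \right) \\ &\leq \max\left( -\inf_{X \in \mathcal{F}} I_L(X),\ -a \right) \\ &= -\min\left( \inf_{X \in \mathcal{F}} I_L(X),\ a \right) \end{aligned} \tag{117} \]

Since (117) holds for all $a \in \mathbb{R}_+$, passing to the limit as $a \to \infty$ on both sides we obtain

\[ \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}(\mathcal{F}) \leq -\inf_{X \in \mathcal{F}} I_L(X) \tag{118} \]

This establishes the MDP upper bound for arbitrary closed sets. ■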

7. Proofs of Theorems 

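Theorem 4.2: The MDP lower and upper bounds obtained in Lemma 6.2 and Lemma 6.9, respectively, show that the family $\{\mu^{\gamma}\}$ satisfies an MDP at scale $-\ln(1-\gamma)$ as $\gamma \uparrow 1$ with a good rate function $I_L(\cdot)$. To complete the proof of Theorem 4.2 it suffices to show that $I(\cdot) = I_L(\cdot)$, i.e., that the function $I(\cdot)$ is lower semicontinuous. We now show that

\[ I(X) = I_L(X), \quad \forall X \in \mathbb{S}_+^N \tag{119} \]

Clearly, if $I_L(X) = \infty$, the claim in (119) follows from (212). We thus consider the case $I_L(X) < \infty$. Since

\[ I_L(X) = \lim_{\varepsilon \to 0} \inf_{Y \in \overline{B}_{\varepsilon}(X)} I(Y) \tag{120} \]

and the integer-valued quantity $\inf_{Y \in \overline{B}_{\varepsilon}(X)} I(Y)$ is non-decreasing as $\varepsilon$ decreases, there exists $\varepsilon_0 > 0$ such that

\[ \inf_{Y \in \overline{B}_{\varepsilon}(X)} I(Y) = I_L(X), \quad \forall \varepsilon \leq \varepsilon_0 \tag{121} \]

The infimum above is achieved for every $\varepsilon > 0$, and we conclude that there exists a sequence $\{X_n\}_{n \in \mathbb{N}}$ such that

\[ X_n \in \overline{B}_{\varepsilon_0}(X), \quad \lim_{n \to \infty} X_n = X, \quad I(X_n) = I_L(X) \tag{122} \]

Recall the set of strings

\[ \mathcal{U}\left(\overline{B}_{\varepsilon_0}(X)\right) = \left\{ \mathcal{R} \in \mathcal{S}^{P^*} \;\middle|\; \mathcal{N}(\mathcal{R}) \in \overline{B}_{\varepsilon_0}(X) \right\} \tag{123} \]

We then have

\[ \ell\left(\overline{B}_{\varepsilon_0}(X)\right) = \inf_{Y \in \overline{B}_{\varepsilon_0}(X)} I(Y) = I_L(X) \tag{124} \]

Since $\overline{B}_{\varepsilon_0}(X)$ is closed, by Lemma 6.5 there exists $t_0 \in \mathbb{T}_+$ such that, for $\mathcal{R} \in \mathcal{U}\left(\overline{B}_{\varepsilon_0}(X)\right)$ with $\mathrm{len}(\mathcal{R}) \geq t_0$, we have

\[ \pi\left(\mathcal{R}^{t_0}\right) \geq \ell\left(\overline{B}_{\varepsilon_0}(X)\right) = I_L(X) \tag{125} \]

By the existence of $\{X_n\}$, there exists a sequence $\{\mathcal{R}_n\}$ of strings in $\mathcal{U}\left(\overline{B}_{\varepsilon_0}(X)\right)$ such that

\[ \mathcal{N}(\mathcal{R}_n) = X_n, \quad \pi(\mathcal{R}_n) = I_L(X) \tag{126} \]

Note that, without loss of generality, we can assume that $\mathrm{len}(\mathcal{R}_n) = t_0$ for all $n$. Indeed, if $\mathrm{len}(\mathcal{R}_n) < t_0$, we can modify $\mathcal{R}_n$ by appending the requisite number of $f_1$'s at the right end, still satisfying (126). On the other hand, if $\mathrm{len}(\mathcal{R}_n) > t_0$, we note that $\mathcal{R}_n$ must be of the form

\[ \mathcal{R}_n = \left(f_{i_1}, \dots, f_{i_{t_0}}, f_1^{\mathrm{len}(\mathcal{R}_n) - t_0}, P^*\right) \tag{127} \]

where the truncated string

\[ \mathcal{R}_n^{t_0} = f_{i_1} \circ \cdots \circ f_{i_{t_0}}(P^*) \tag{128} \]

satisfies

\[ \mathcal{N}\left(\mathcal{R}_n^{t_0}\right) = X_n, \quad \pi\left(\mathcal{R}_n^{t_0}\right) = I_L(X) \tag{129} \]

The second equality in (129) follows from (125), whereas the form in (127) follows from the fact that $\pi(\mathcal{R}_n) = I_L(X)$, implying

\[ \pi(\mathcal{R}_n) = \pi\left(\mathcal{R}_n^{t_0}\right) \tag{130} \]

Thus the right end of $\mathcal{R}_n$ does not contain any $f_0$, explaining the form in (127). The key conclusion of the above discussion is that, if $\mathrm{len}(\mathcal{R}_n) > t_0$, we may consider the truncated string $\mathcal{R}_n^{t_0}$ instead, which also satisfies (126). We thus assume that the sequence $\{\mathcal{R}_n\}$ with the properties in (126) satisfies:

\[ \mathrm{len}(\mathcal{R}_n) = t_0, \quad \forall n \tag{131} \]

The number of distinct strings in the sequence $\{\mathcal{R}_n\}$ is at most $2^{t_0}$ (in fact, fewer than that, because of the constraint $\pi(\mathcal{R}_n) = I_L(X)$) and, hence, at least one pattern is repeated infinitely often in the sequence $\{\mathcal{R}_n\}$, i.e., there exists a string $\mathcal{R}^*$ such that

\[ \mathrm{len}(\mathcal{R}^*) = t_0, \quad \pi(\mathcal{R}^*) = I_L(X) \tag{132} \]

and a subsequence $\{\mathcal{R}_{n_k}\}_{k \in \mathbb{N}}$ of $\{\mathcal{R}_n\}$ such that

\[ \mathcal{R}_{n_k} = \mathcal{R}^*, \quad \forall k \in \mathbb{N} \tag{133} \]

The corresponding subsequence $\{X_{n_k}\}$ of numerical values then satisfies

\[ X_{n_k} = \mathcal{N}\left(\mathcal{R}_{n_k}\right) = \mathcal{N}(\mathcal{R}^*), \quad \forall k \in \mathbb{N} \tag{134} \]

and hence

\[ X = \lim_{k \to \infty} X_{n_k} = \mathcal{N}(\mathcal{R}^*) \tag{135} \]

Thus the string $\mathcal{R}^* \in \mathcal{S}^{P^*}(X)$ and hence

\[ I(X) = \inf_{\mathcal{R} \in \mathcal{S}^{P^*}(X)} \pi(\mathcal{R}) \leq \pi(\mathcal{R}^*) = I_L(X) \tag{136} \]

The other inequality $I_L(X) \leq I(X)$ is obvious and hence we conclude from (136) that

\[ I_L(X) = I(X) \tag{137} \]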
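This completes the proof of Theorem 4.2. ■

Theorem 4.1: Recall

\[ d_P\left(\mu^{\gamma}, \delta_{P^*}\right) = \inf \left\{ \varepsilon > 0 \;\middle|\; \delta_{P^*}(\mathcal{F}) \leq \mu^{\gamma}(\mathcal{F}_{\varepsilon}) + \varepsilon, \ \forall \text{ closed set } \mathcal{F} \right\} \tag{138} \]

Define the class of sets

\[ \mathcal{C} = \left\{ \mathcal{F} \;\middle|\; \mathcal{F} \text{ is closed and } P^* \in \mathcal{F} \right\} \tag{139} \]

Then the following equivalence is straightforward:

\[ d_P\left(\mu^{\gamma}, \delta_{P^*}\right) = \inf \left\{ \varepsilon > 0 \;\middle|\; \mu^{\gamma}(\mathcal{F}_{\varepsilon}) + \varepsilon \geq 1, \ \forall \mathcal{F} \in \mathcal{C} \right\} \tag{140} \]

Now consider $0 < \varepsilon < 1$, small enough. Then there exists $\varepsilon_0 > 0$ such that, for every $\mathcal{F} \in \mathcal{C}$, we have $B_{\varepsilon_0}(P^*) \subset \mathcal{F}_{\varepsilon}$. (Note that the constant $\varepsilon_0$ can be chosen independently of $\mathcal{F}$, but depends on $\varepsilon$.) The numerical value of the string $\mathcal{R} = P^*$ belongs to $B_{\varepsilon_0}(P^*)$ and hence, by Corollary 6.3, there exists an integer $t_0 > 0$ such that

\[ \mu^{\gamma}\left(B_{\varepsilon_0}(P^*)\right) \geq (1-\gamma)^{\pi(\mathcal{R})} \gamma^{t_0 - \pi(\mathcal{R})} = \gamma^{t_0} \tag{141} \]

Thus, for all $\mathcal{F} \in \mathcal{C}$,

\[ \mu^{\gamma}(\mathcal{F}_{\varepsilon}) \geq \mu^{\gamma}\left(B_{\varepsilon_0}(P^*)\right) \geq \gamma^{t_0} \tag{142} \]

Then, for $\gamma > (1-\varepsilon)^{1/t_0}$, we have for all $\mathcal{F} \in \mathcal{C}$

\[ \mu^{\gamma}(\mathcal{F}_{\varepsilon}) + \varepsilon \geq \gamma^{t_0} + \varepsilon > 1 \tag{143} \]

It then follows from (140) that

\[ d_P\left(\mu^{\gamma}, \delta_{P^*}\right) \leq \varepsilon, \quad \gamma > (1-\varepsilon)^{1/t_0} \tag{144} \]

Hence

\[ \limsup_{\gamma \uparrow 1} d_P\left(\mu^{\gamma}, \delta_{P^*}\right) \leq \varepsilon \tag{145} \]

Since $\varepsilon > 0$ is arbitrary, by passing to the limit as $\varepsilon \to 0$, we have the weak convergence

\[ \lim_{\gamma \uparrow 1} d_P\left(\mu^{\gamma}, \delta_{P^*}\right) = 0 \tag{146} \]

For the second assertion, we note that for $\varepsilon > 0$ the closed set $B_{\varepsilon}^{C}(P^*)$ does not contain $P^*$. Hence $\ell\left(B_{\varepsilon}^{C}(P^*)\right) \geq 1$. The claim in (44) then follows from the MDP upper bound for closed sets (Lemma 6.9). ■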

8. Computations with the rate function 

A complete characterization of probabilities under the invariant distributions is obtained in Theorem 4.2, which shows that the probability of 'rare events'$^{7}$ decays as powers of $(1-\gamma)$ as $\gamma \uparrow 1$. The best exponent of this power law decay is characterized by the MDP rate function $I(\cdot)$. As Theorem 4.2 shows, the best decay exponent of such a rare event can be computed as the infimum of $I(\cdot)$ over that set. This reduces the problem of estimating probabilities under the invariant distributions to the solution of a related variational problem, namely, minimizing $I(\cdot)$ over Borel sets in $\mathbb{S}_+^N$. From a numerical analysis point of view, one may simulate the function $I(\cdot)$ and use a look-up table to numerically solve the associated variational problems. In this section, we show that, for a class of events of interest, the variational problem of computing the best decay exponent can be simplified to a great extent.

7 The term rare event in this context refers to a Borel set (event) bounded away from $P^*$.

In particular, we are interested in estimating probabilities of the form $\mu^{\gamma}\left(B_{\varepsilon}^{C}(P^*)\right)$ for every $\varepsilon > 0$. Theorem 4.1 shows that such probabilities decay to zero at least as fast as $(1-\gamma)$, i.e., for every $\varepsilon > 0$, we have

\[ \mu^{\gamma}\left(B_{\varepsilon}^{C}(P^*)\right) = O(1-\gamma) \tag{147} \]

However, this upper bound becomes loose as $\varepsilon$ increases, and the best exponent of decay, i.e., the best power of $(1-\gamma)$ in (147), can be computed precisely by solving the variational problem in Theorem 4.2. The main result of this section shows that, for a particular class of systems, this computation simplifies considerably; for general systems, we present bounds on the decay exponent that are simple to obtain compared with solving the full-fledged variational problem. For a system designer, probabilities of the form in (147) are of interest and a technique to obtain them efficiently is of much use. We start by setting some notation.
Define the function l : R + 1 — > T + by 

t(M) =inf {keT+ I \\fo(P*) -P*\\ > M) (148) 

Also, define t + : R + 1 — > T + by 

t+(M) = inf {k e T+ | ||/ fe (P*) - P*|| > M) (149) 
We note that t(-) is a non-decreasing right continuous function, and we have for all M > 

t+(M) < b(M) + 1, lim dU)=i+(M) (150) 

U<M:U^M 

Also recall: 

B C M (P*) = {l£§f \\X — P*\\ > M} (151) 
B C M {P*) ={l£§f \\X - P*\\ > M} (152) 

Definition 8.1 (Class A systems): Let $(A, Q, C, R)$ be a system satisfying the assumption (E.1). Then the system is called a class A system if

\[ \mathbb{S}_{-} \supset \left\{ X \in \mathbb{S}_+^N \;\middle|\; X \succeq f_0(P^*) \right\} \tag{153} \]

where $\mathbb{S}_{-}$ is defined in (169).

We then have the following MDP asymptotics for class A systems (see Appendix E for a proof): 

Lemma 8.2 Let (A, Q, C, R) be a class A system. Then we have for all M > 

limsup- 1 , ln^(£^(P*)) < -i(M) (154) 
7fi ln(l-7j 

and 

^-MT^T)^ (^m(^)) > (155) 

Remark 8.3 Lemma 8.2 shows that, for class A systems, the variational problem of computing the best decay exponent can be simplified to a great extent. In particular, rather than generating the set of all possible strings $\mathcal{S}^{P^*}$ (which grows exponentially with the length of the strings), one can systematically look at strings of the form $f_0^{k}(P^*)$, $k \in \mathbb{T}_+$, and obtain the decay exponent. The next natural question is whether there exists a suitable characterization of class A systems. Determining whether a given system is class A is numerically simple, as one only needs to check the condition in (153). From a theoretical point of view, it would be relevant to characterize class A systems through properties of the system matrices. A detailed study to that end would be a digression from the main theme of the paper, and we intend to pursue it elsewhere. However, we show that scalar systems are included in class A, thereby confirming that the class is not empty.

Proposition 8.4 Let $(A, Q, C, R)$ be a scalar system satisfying (E.1) (i.e., $Q, R > 0$ and $C \neq 0$). Then $(A, Q, C, R)$ belongs to class A.

Proof: Define the function $g : \mathbb{R}_+ \to \mathbb{R}$ by

\[ g(X) = f_1(X) - X, \quad \forall X \geq 0 \tag{156} \]

where $f_1$ is the scalar Riccati operator. Under the assumptions, $f_1$ has only one fixed point in $\mathbb{R}_+$, namely $P^*$, the steady state solution. Thus, in the domain of interest,

\[ g(X) = 0 \ \text{ iff } \ X = P^* \tag{157} \]

We note that $g(0) > 0$ and, by [22], there exists $a_{P^*} > P^*$ sufficiently large such that

\[ g(X) = f_1(X) - X < 0, \quad \forall X \geq a_{P^*} \tag{158} \]

We now claim that

\[ g(X) > 0, \ \ 0 \leq X < P^* \quad \text{and} \quad g(X) < 0, \ \ X > P^* \tag{159} \]

Indeed, if this were not true, then by $g(0) > 0$ and (158) there would exist an interval in $\mathbb{R}_+$ not containing $P^*$ over which the sign of $g(\cdot)$ changes. By the continuity of $g(\cdot)$, this would in turn imply the existence of another solution to the equation $g(X) = 0$ on $\mathbb{R}_+$ other than $P^*$. This contradicts (157) and, hence, the claim in (159) holds.

Since $f_0(P^*) > P^*$, it then follows from (159) that, for $X \geq f_0(P^*)$,

\[ f_1(X) - X = g(X) < 0 \tag{160} \]

thus showing that scalar systems belong to class A. ■
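The sign pattern in (159) can be checked numerically for any particular scalar system. The sketch below assumes the standard scalar form of the Riccati operator, $f_1(X) = A^2 X + Q - A^2 C^2 X^2 / (C^2 X + R)$, and obtains $P^*$ as the positive root of the quadratic that results from $f_1(X) = X$; the function names are hypothetical.

```python
import math


def riccati_f1(x: float, a: float, q: float, c: float, r: float) -> float:
    """Scalar Riccati operator f_1 (assumed standard form)."""
    return a * a * x + q - (a * a * c * c * x * x) / (c * c * x + r)


def fixed_point(a: float, q: float, c: float, r: float) -> float:
    """Positive root of c^2 x^2 + (r - a^2 r - q c^2) x - q r = 0.

    This quadratic is f_1(x) = x multiplied through by (c^2 x + r).
    """
    b = r - a * a * r - q * c * c
    return (-b + math.sqrt(b * b + 4.0 * c * c * q * r)) / (2.0 * c * c)


def g(x: float, a: float, q: float, c: float, r: float) -> float:
    """g(x) = f_1(x) - x, as in (156)."""
    return riccati_f1(x, a, q, c, r) - x
```

Evaluating `g` on a grid on either side of `fixed_point(a, q, c, r)` gives a quick sanity check of (159) for a chosen set of parameters; it does not replace the proof.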

As shown by Lemma 8.2, for class A systems the computation of the decay exponent of rare events can be greatly simplified. For general systems such a simplification may not be possible; however, the variational problem in Theorem 4.2 can still be solved more efficiently than by searching haphazardly over the set of strings $\mathcal{S}^{P^*}$. The following proposition outlines a simple algorithm for solving the variational problems of interest, leading to the decay exponents of rare events in general systems:

Proposition 8.5 Let $\Gamma \subset \mathbb{S}_+^N$ and define

\[ k_{\Gamma} = \inf \left\{ k \geq 0 \;\middle|\; \mathcal{N}\left(f_0^{k}(P^*)\right) \in \Gamma \right\} \tag{161} \]

Define the set

\[ \mathcal{J}_{\Gamma} = \left\{ \mathcal{R} \in \mathcal{S}^{P^*} \;\middle|\; \mathcal{N}(\mathcal{R}) \in \Gamma \text{ and } \mathrm{len}(\mathcal{R}) \leq k_{\Gamma} \right\} \tag{162} \]

Then

\[ \inf_{X \in \Gamma} I(X) = \inf_{\mathcal{R} \in \mathcal{J}_{\Gamma}} \pi(\mathcal{R}) \tag{163} \]

Proof: The proof is straightforward and follows from the fact that $\inf_{X \in \Gamma} I(X) \leq k_{\Gamma}$, so it suffices to look at strings of length at most $k_{\Gamma}$. ■

Remark 8.6 The conclusion of Proposition 8.5 is that, to find the decay exponent of a rare event $\Gamma$, one may compute $k_{\Gamma}$ according to (161); the minimizing string can then be found in the set $\mathcal{J}_{\Gamma}$ constructed above.
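The search described in Proposition 8.5 and Remark 8.6 can be sketched as a brute-force enumeration. The example below is an illustration under stated assumptions rather than the paper's implementation: it uses a scalar system with $A = \sqrt{2}$, $C = Q = R = 1$ (so that $f_0(x) = 2x + 1$ and $f_1(x) = 2x + 1 - 2x^2/(x+1)$), represents the event $\Gamma$ by a membership predicate `in_gamma`, and caps the search for $k_{\Gamma}$ at a hypothetical `max_k`.

```python
import itertools
import math

P_STAR = 1.0 + math.sqrt(2.0)  # fixed point of f1 for the assumed system


def f0(x: float) -> float:
    """Riccati update when the packet is dropped."""
    return 2.0 * x + 1.0


def f1(x: float) -> float:
    """Riccati update when the packet arrives."""
    return 2.0 * x + 1.0 - 2.0 * x * x / (x + 1.0)


def value(indices) -> float:
    """Numerical value f_{i_1} o ... o f_{i_t}(P*); the rightmost map acts first."""
    x = P_STAR
    for i in reversed(indices):
        x = f1(x) if i == 1 else f0(x)
    return x


def decay_exponent(in_gamma, max_k: int = 20) -> int:
    """Minimum number of f0's over strings of length <= k_Gamma landing in Gamma."""
    x, k_gamma = P_STAR, None
    for k in range(max_k + 1):  # k_Gamma as in (161)
        if in_gamma(x):
            k_gamma = k
            break
        x = f0(x)
    if k_gamma is None:
        raise ValueError("k_Gamma not found within max_k iterations")
    best = k_gamma  # the all-f0 string of length k_Gamma lies in Gamma
    for length in range(k_gamma + 1):
        for s in itertools.product((0, 1), repeat=length):
            if in_gamma(value(s)):
                best = min(best, s.count(0))
    return best
```

The enumeration visits on the order of $2^{k_{\Gamma}+1}$ strings, so it is practical only when $k_{\Gamma}$ is small; for class A systems, Lemma 8.2 makes it unnecessary.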

9. A Scalar Example 

We present a numerical study to demonstrate the efficiency of our approach compared with extensive Monte Carlo simulations for estimating the decay rate of rare events.

Consider a scalar system with parameters $A = \sqrt{2}$, $C = Q = R = 1$. Solving the algebraic Riccati equation $X = f_1(X)$, we obtain $P^* = 1 + \sqrt{2}$. By Proposition 8.4, the system is of class A. By Lemma 8.2, we then have, for $M > 0$,

\[ -\iota_{+}(M) \leq \liminf_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}\left(B_{M}^{C}(P^*)\right) \leq \limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^{\gamma}\left(B_{M}^{C}(P^*)\right) \leq -\iota(M) \tag{164} \]

(note that we use the alternative MDP representation, (10)).

Now choose $M_1 = 40 - P^*$; we estimate the decay rate of the rare event $B_{M_1}^{C}(P^*)$ as $\gamma \to 1$. Using the definitions and the fact that

\[ P^* < f_0^{3}(P^*) < M_1 + P^* < f_0^{4}(P^*) \tag{165} \]

we note that

\[ \iota(M_1) = \iota_{+}(M_1) = 4 \tag{166} \]

By (164) and (166), $B_{M_1}^{C}(P^*)$ is an $I$-continuity set (see the text following (12)); hence the limit in (164) exists and we have

\[ \lim_{\gamma \uparrow 1} \frac{\ln \mu^{\gamma}\left(B_{M_1}^{C}(P^*)\right)}{\ln(1-\gamma)} = 4 \tag{167} \]

Our theory then predicts that, for $\gamma$ close to 1,

\[ \mu^{\gamma}\left(B_{M_1}^{C}(P^*)\right) \sim (1-\gamma)^{4} \tag{168} \]
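The arithmetic behind (165) and (166) is easy to reproduce. The sketch below assumes the scalar maps implied by $A = \sqrt{2}$, $C = Q = R = 1$, namely $f_0(x) = 2x + 1$ and $f_1(x) = 2x + 1 - 2x^2/(x+1)$, and a helper `iota` (a hypothetical name) that returns the first $k$ at which $|f_0^{k}(P^*) - P^*|$ reaches the threshold; the `strict` flag switches between a non-strict and a strict comparison.

```python
import math

P_STAR = 1.0 + math.sqrt(2.0)


def f0(x: float) -> float:
    """Riccati update without an observation: A^2 x + Q with A^2 = 2, Q = 1."""
    return 2.0 * x + 1.0


def f1(x: float) -> float:
    """Riccati update with an observation, for C = R = 1."""
    return 2.0 * x + 1.0 - 2.0 * x * x / (x + 1.0)


def iota(m: float, strict: bool = False, max_k: int = 1000) -> int:
    """First k with |f0^k(P*) - P*| >= m (or > m when strict is True)."""
    x = P_STAR
    for k in range(max_k + 1):
        dist = abs(x - P_STAR)
        if dist > m or (not strict and dist >= m):
            return k
        x = f0(x)
    raise ValueError("threshold not reached within max_k iterations")


def f0_iterates(n: int) -> list:
    """The values f0^1(P*), ..., f0^n(P*)."""
    out, x = [], P_STAR
    for _ in range(n):
        x = f0(x)
        out.append(x)
    return out
```

Calling `f0_iterates(4)` shows that the third iterate lies below 40 and the fourth above it, which is the content of (165), and `iota(40 - P_STAR)` returns 4 with either comparison, in agreement with (166).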

We now estimate the empirical decay rate of the event through extensive numerical simulations. We simulate different values of $\gamma$ in the range $[0.55, 1)$ with a step size of $0.05$. For each such value of $\gamma$, we obtained $10^{4}$ samples from the invariant measure $\mu^{\gamma}$ (this many samples are needed because the event $B_{M_1}^{C}(P^*)$ becomes increasingly difficult to observe as $\gamma$ approaches 1). Obtaining a sample from $\mu^{\gamma}$ is also numerically intensive: we iterated the RRE 100 times to make sure the random covariance had converged in distribution to $\mu^{\gamma}$. Thus, a total of $9 \times 10^{4} \times 10^{2}$ computations were involved. The resulting empirical cumulative distribution functions (cdfs) are plotted in Fig. 1 (on the left). We note that, as $\gamma \to 1$, the empirical invariant measures converge to $\delta_{P^*}$, thus verifying Theorem 4.1.

To obtain the empirical decay rate of $B^C_{M_1}(P^*)$, for each $\gamma$ we numerically estimate the quantity $\ln \mu^\gamma\big(B^C_{M_1}(P^*)\big) / \ln(1-\gamma)$ from the empirical cdfs obtained above. This is plotted in Fig. 1 (on the right) as a function of $\gamma$ (the solid line). The result agrees with our theoretical findings: (1) qualitatively, the probability of the rare event decays as a power law of $(1 - \gamma)$; (2) the best decay exponent approaches 4 (which is theoretically established in (167)) as $\gamma$ gets closer to 1; (3) even for $\gamma$ much less than 1, the empirical decay rate is close to 4, justifying (168).
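A minimal Python sketch of this Monte Carlo experiment is given below. It is not the authors' simulation code: the burn-in of 100 steps and the $10^4$ samples follow the description above, while the starting point $P_0 = P^*$, the seed, and the function names are assumptions chosen for reproducibility.

```python
import math
import random

P_STAR = 1.0 + math.sqrt(2.0)


def f0(x):
    """Riccati step when the packet is dropped."""
    return 2.0 * x + 1.0


def f1(x):
    """Riccati step when the packet arrives (scalar case, C = R = 1)."""
    return 2.0 * x + 1.0 - 2.0 * x * x / (x + 1.0)


def sample_invariant(gamma, rng, burn_in=100, x0=P_STAR):
    """Run the random Riccati equation for burn_in steps; the end point
    is treated as an approximate draw from the invariant measure."""
    x = x0
    for _ in range(burn_in):
        x = f1(x) if rng.random() < gamma else f0(x)
    return x


def empirical_exponent(gamma, n_samples=10_000, threshold=40.0, seed=0):
    """Estimate ln(prob of rare event) / ln(1 - gamma) from simulated samples."""
    rng = random.Random(seed)
    hits = sum(sample_invariant(gamma, rng) >= threshold for _ in range(n_samples))
    if hits == 0:
        return float("inf")  # event never observed at this sample size
    return math.log(hits / n_samples) / math.log(1.0 - gamma)


if __name__ == "__main__":
    for gamma in (0.55, 0.65, 0.75):
        print(gamma, empirical_exponent(gamma))
```

For $\gamma$ close to 1 the event is so rare that $10^4$ samples frequently contain no hits at all, which is the practical limitation of simulation that the variational characterization avoids.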

This example demonstrates the relevance of our theoretical findings. Even for the simple scalar case, a modest numerical estimation of the probabilities of rare events required on the order of $10^7$ computations, whereas our theoretical findings provide the best decay exponent by solving a much simpler variational problem.
[Figure 1: left panel, empirical CDF as a function of X; right panel, decay exponent as a function of $\gamma$.]

Fig. 1. Left: Weak convergence of the (empirical) measures $\mu^\gamma$ to $\delta_{P^*}$ as $\gamma \to 1$. Right: Decay exponent of the probability of the rare event $B^C_{M_1}(P^*)$.



10. Conclusions 

The paper studies the RRE arising from the problem of Kalman filtering with intermittent observations. We show that for every $\gamma > 0$ the conditional mean-squared error process is ergodic and that the resulting family of invariant distributions $\{\mu^\gamma\}$, which converges weakly to $\delta_{P^*}$ as $\gamma \uparrow 1$, satisfies an MDP with good rate function $I$. The rate function $I$ is completely characterized, and the asymptotic decay rates of rare events are characterized as solutions of deterministic variational problems. The intermediate results obtained are of independent interest, and our methods are general enough to apply to the analysis of more complex networked control systems (see, for example, [18]) and hybrid or switched systems.
Appendix A
Proofs of Proposition 3.6, Lemma 3.7 

Proof of Proposition 3.6 

Proof: Assertion (i) follows from the fact that $f_1(P^*) = P^*$, whereas Assertion (ii) is obvious from the iterated construction of the sequence $\{P_t\}_{t \in \mathbb{T}_+}$. We now prove Assertion (iii).
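Following Bucy ([22]), we define the set

$$\mathbb{S}^- = \{X \in \mathbb{S}_+^N \mid f_1(X) \preceq X\} \tag{169}$$

Under the assumptions of observability and controllability, it can be shown (see [22]) that there exists $\beta \in \mathbb{R}_+$ such that

$$\{X \in \mathbb{S}_+^N \mid X \succeq \beta I\} \subset \mathbb{S}^- \tag{170}$$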

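Now choose $\alpha_{P_0} > \beta$ such that $P_0 \preceq \alpha_{P_0} I$. Then $\alpha_{P_0} I \in \mathbb{S}^-$, and it follows from the order-preserving property of the operators $f_0$ and $f_1$ that

$$\mathcal{N}(\mathcal{R}) \preceq \mathcal{N}(\mathcal{R}_1) \tag{171}$$

where $\mathcal{R}_1 = (f_{i_1}, \dots, f_{i_t}, \alpha_{P_0} I)$. Note that the claim is trivial for $t = 0$, so assume $t \ge 1$. Let $j_1 < \dots < j_{\pi(\mathcal{R})}$ be the indices corresponding to the $f_0$'s in $\mathcal{R}$ from left to right. Consider the last segment of $\mathcal{R}_1$, i.e., the string $(f_0, f_1^{t - j_{\pi(\mathcal{R})}}, \alpha_{P_0} I)$.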
We then have

$$f_0 \circ f_1^{t - j_{\pi(\mathcal{R})}}(\alpha_{P_0} I) \preceq f_0(\alpha_{P_0} I) \tag{172}$$

This follows from the fact that $f_1(\alpha_{P_0} I) \preceq \alpha_{P_0} I$, which implies that the sequence $\{f_1^s(\alpha_{P_0} I)\}_{s \ge 0}$ is decreasing, so that $f_1^{t - j_{\pi(\mathcal{R})}}(\alpha_{P_0} I) \preceq \alpha_{P_0} I$. Since $\alpha_{P_0} > \beta$ and $A$ is unstable, we have $f_0(\alpha_{P_0} I) \succeq \alpha_{P_0} I \succeq \beta I$ and $f_0(\alpha_{P_0} I) \in \mathbb{S}^-$. In particular, we have

$$\mathcal{N}(\mathcal{R}_1) \preceq f_{i_1} \circ \cdots \circ f_{i_{j_{\pi(\mathcal{R})} - 1}}\big(f_0(\alpha_{P_0} I)\big) \tag{173}$$

using the order-preserving property of the functions $f_0$, $f_1$. In a similar way, we can repeat the above argument inductively, starting with the string on the R.H.S. of (173), and arrive at $\mathcal{N}(\mathcal{R}_1) \preceq f_0^{\pi(\mathcal{R})}(\alpha_{P_0} I)$. The claim then follows from (171).

■ 

Proof of Lemma 3.7 

Proof: We use the following result on uniform convergence over compact sets of the Riccati iterates to $P^*$. Let $X_1, X_2 \in \mathbb{S}_+^N$. Then it can be shown (see Theorem 7.5 in [23])$^{8}$ that there exist constants $c_1, c_2 > 0$ such that

$$\|f_1^t(X_1) - f_1^t(X_2)\| \le c_1 e^{-c_2 t} \|X_1 - X_2\| \tag{174}$$

Taking $X_1 = X$, $X_2 = P^*$ and noting that $f_1(P^*) = P^*$, we have from the above

$$\|f_1^t(X) - P^*\| \le c_1 e^{-c_2 t} \|X - P^*\| \tag{175}$$

Thus for every compact subset $K \subset \mathbb{S}_+^N$, there exists $t_\varepsilon$, depending on $K$, such that

$$\|f_1^t(X) - P^*\| \le \varepsilon, \quad X \in K,\ t \ge t_\varepsilon \tag{176}$$

To prove the Lemma, we need to transform the above uniform convergence over compact sets into uniform convergence over the entire space $\mathbb{S}_+^N$. To this end, we note that for an observable and controllable system, the following uniform boundedness of Riccati iterates holds from an arbitrary initial state $X \in \mathbb{S}_+^N$ (see, for example, Lemma 7.1 in [23]):

$$f_1^t(X) \preceq \alpha_1 I, \quad t \ge N,\ \forall X \in \mathbb{S}_+^N \tag{177}$$

where $\alpha_1 \in \mathbb{R}_+$ is a sufficiently large number.
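Now consider the compact set

$$K_{\alpha_1} = \{X \in \mathbb{S}_+^N \mid \|X\| \le \alpha_1\} \tag{178}$$

From (177) we have

$$f_1^t(X) \in K_{\alpha_1}, \quad t \ge N,\ \forall X \in \mathbb{S}_+^N \tag{179}$$

Now from (176) choose $t'_\varepsilon$ such that

$$\|f_1^t(X) - P^*\| \le \varepsilon, \quad X \in K_{\alpha_1},\ t \ge t'_\varepsilon \tag{180}$$

Then, defining $t_\varepsilon = t'_\varepsilon + N$, we have from eqns. (179,180), for all $X \in \mathbb{S}_+^N$,

$$\|f_1^t(X) - P^*\| \le \varepsilon, \quad t \ge t_\varepsilon \tag{181}$$

and the result follows. ■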
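The uniformity over initial conditions can be illustrated numerically for the scalar system of Section 9. The Python sketch below is an illustration under assumptions, not part of the proof: it uses the fact that, for $A^2 = 2$ and $C = Q = R = 1$, the map $f_1(x) = 2x + 1 - 2x^2/(x+1)$ simplifies algebraically to $(3x+1)/(x+1)$, and it takes a hypothetical grid of starting points spanning fifteen orders of magnitude.

```python
from math import sqrt

P_STAR = 1.0 + sqrt(2.0)


def f1(x):
    """Scalar Riccati map with an observation, written in the simplified
    form (3x + 1) / (x + 1) to avoid cancellation for very large x."""
    return (3.0 * x + 1.0) / (x + 1.0)


def sup_error(t, starts):
    """Largest distance |f1^t(x) - P*| over the given starting points."""
    worst = 0.0
    for x in starts:
        for _ in range(t):
            x = f1(x)
        worst = max(worst, abs(x - P_STAR))
    return worst


if __name__ == "__main__":
    # Hypothetical grid of initial conditions, from 0 up to 1e12.
    starts = [0.0] + [10.0 ** e for e in range(-3, 13)]
    for t in (1, 5, 10, 25):
        print(t, sup_error(t, starts))
```

After a single step every iterate lies in $[1, 3)$, which plays the role of the compact set in (178) with $N = 1$; from there the contraction in (175) takes over.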



Appendix B 

Stochastic boundedness of error covariances 

Proof of Lemma 3.9 

Proof: We consider the case of unstable $A$. For stable $A$, the claim is trivial: the unconditional variance of the state sequence reaches a steady state (and is hence bounded), and the suboptimal estimate $\hat{x}_t = 0$ for all $t$ leads to pathwise boundedness of the corresponding error covariance. In fact, in this case even $\gamma = 0$ leads to stochastic boundedness of the sequence $\{P_t\}$ from every initial condition.

The proof follows the same line of arguments as Proposition 6 of [2], where the above claim was established for the case of invertible $C$. The key ingredients used there consisted of uniformly bounding the Riccati operator (when $C$ is invertible) and then estimating the probability that the random sequence $\{P_t\}_{t \in \mathbb{T}_+}$ exceeds a particular range by relating it to the length of the random time intervals between packet arrivals.

In the general case, as shown below, instead of bounding the one-step Riccati iterates, we bound the $N$-step Riccati iterates for controllable and observable systems and then repeat the arguments in [2] in greater generality.



8 The result in [23] applies to time-variant system matrices under the assumptions of uniform complete observability and uniform complete 
controllability, which reduces to the observability and controllability for the time-invariant system matrices considered in this paper. 
To this end, we start with the following result on boundedness of $N$-step Riccati iterates. It can be shown (see Lemma 7.1 in [23]) that for controllable and observable systems$^{9}$ the following holds:

$$f_1^t(X) \preceq \kappa I, \quad t \ge N,\ X \in \mathbb{S}_+^N \tag{182}$$

where $\kappa \in \mathbb{R}_+$ is sufficiently large. In other words, applying the Riccati operator at least $N$ times in succession leads to a covariance bounded above by a specific constant, irrespective of the initial state.
Now, for $M \in \mathbb{T}_+$ sufficiently large, define

$$k(M) = \min\left\{k \in \mathbb{T}_+ \;\middle|\; \kappa_1 \alpha^{2k} + \|Q\| \frac{\alpha^{2k} - 1}{\alpha^2 - 1} \ge M\right\} \tag{183}$$

where $\alpha = \|A\|$ and $\kappa_1 = \max\{\kappa, \|P_0\|\}$. Since $A$ is unstable ($\alpha > 1$), it follows that $k(M) \to \infty$ as $M \to \infty$. To estimate the probability $\mathbb{P}^{\gamma,P_0}(\|P_t\| > M)$ for $t \in \mathbb{T}_+$, define the random time $\tilde{t}$ by

$$\tilde{t} = \max\{0 < s \le t \mid \gamma_{s-r} = 1,\ 1 \le r \le N\} \tag{184}$$

where the maximum of an empty set is taken to be zero. Thus, if $\tilde{t} \ne 0$,$^{10}$ it denotes the time closest to $t$ such that there were $N$ successive packet arrivals in the time interval $[\tilde{t} - N, \tilde{t} - 1]$. Then, using the above arguments, we have

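$$\|P_{\tilde{t}}\| \le \kappa_1 \tag{185}$$

Indeed, if $\tilde{t} = 0$, then $P_{\tilde{t}} = P_0$ and (185) holds by the definition of $\kappa_1$. On the contrary, if $\tilde{t} > 0$, we have by (182)

$$\|P_{\tilde{t}}\| = \big\|f_1^N(P_{\tilde{t}-N})\big\| \le \kappa \le \kappa_1 \tag{186}$$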

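We then have

$$\|P_t\| = \big\|f_{\gamma_{t-1}} \circ \cdots \circ f_{\gamma_{\tilde{t}}}(P_{\tilde{t}})\big\| \le \big\|f_0^{t-\tilde{t}}(P_{\tilde{t}})\big\| \le \kappa_1 \alpha^{2(t-\tilde{t})} + \|Q\| \sum_{k=1}^{t-\tilde{t}} \alpha^{2(k-1)} = \kappa_1 \alpha^{2(t-\tilde{t})} + \|Q\| \frac{\alpha^{2(t-\tilde{t})} - 1}{\alpha^2 - 1} \tag{187}$$

where we have used the fact that $f_1(X) \preceq f_0(X)$, $\forall X \in \mathbb{S}_+^N$. It then follows from the above and (183) that

$$\mathbb{P}^{\gamma,P_0}(\|P_t\| > M) \le \mathbb{P}^{\gamma,P_0}\big(t - \tilde{t} \ge k(M)\big) \tag{188}$$

We now estimate the probability $\mathbb{P}^{\gamma,P_0}(t - \tilde{t} = k)$.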

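First, consider the case $\tilde{t} \ne 0$. On the event $\tilde{t} \ne 0$, it is not hard to see that the following events are equal:

$$\{t - \tilde{t} = k\} = \{\gamma_{t-k-r} = 1,\ 1 \le r \le N\} \cap \bigcap_{s=t-k+1}^{t} \{\gamma_{s-r} = 1,\ 1 \le r \le N\}^c \tag{189}$$

It then follows by elementary manipulations and the independence of packet arrivals that

$$\begin{aligned}
\mathbb{P}^{\gamma,P_0}(t - \tilde{t} = k) &= \mathbb{P}^{\gamma,P_0}\Big(\{\gamma_{t-k-r} = 1,\ 1 \le r \le N\} \cap \bigcap_{s=t-k+1}^{t} \{\gamma_{s-r} = 1,\ 1 \le r \le N\}^c\Big) \\
&\le \mathbb{P}^{\gamma,P_0}\Big(\{\gamma_{t-k-r} = 1,\ 1 \le r \le N\} \cap \bigcap_{i=1}^{\lfloor k/N \rfloor} \{\gamma_{t-k+(i-1)N-1+r} = 1,\ 1 \le r \le N\}^c\Big) \\
&= \mathbb{P}^{\gamma,P_0}\big(\{\gamma_{t-k-r} = 1,\ 1 \le r \le N\}\big) \prod_{i=1}^{\lfloor k/N \rfloor} \mathbb{P}^{\gamma,P_0}\big(\{\gamma_{t-k+(i-1)N-1+r} = 1,\ 1 \le r \le N\}^c\big) \\
&= \gamma^N (1-\gamma^N)^{\lfloor k/N \rfloor} \\
&\le (1-\gamma^N)^{\lfloor k/N \rfloor}
\end{aligned} \tag{190}$$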

9 The result in [23] applies to time-variant system matrices under the assumptions of uniform complete observability and uniform complete 
controllability, which reduces to the observability and controllability for the time-invariant system matrices considered in this paper. 

10 Note that, if not zero, $\tilde{t} \ge N$.
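On the event $\tilde{t} = 0$, using a similar set of arguments, we can show

$$\mathbb{P}^{\gamma,P_0}(t - \tilde{t} = k) \le (1-\gamma^N)^{\lfloor k/N \rfloor} \tag{191}$$

We thus have the upper bound (possibly loose, but sufficient for our purpose)

$$\mathbb{P}^{\gamma,P_0}\big(t - \tilde{t} \ge k(M)\big) = \sum_{k=k(M)}^{\infty} \mathbb{P}^{\gamma,P_0}(t - \tilde{t} = k) \le \sum_{k=k(M)}^{\infty} (1-\gamma^N)^{\lfloor k/N \rfloor} \le \sum_{k=k(M)}^{\infty} (1-\gamma^N)^{\frac{k}{N} - 1} \tag{192}$$

Rearranging and summing the geometric series above we have

$$\mathbb{P}^{\gamma,P_0}\big(t - \tilde{t} \ge k(M)\big) \le \frac{1}{1-\gamma^N} \sum_{k=k(M)}^{\infty} \Big[(1-\gamma^N)^{1/N}\Big]^k = \frac{1}{1-\gamma^N} \cdot \frac{\big[(1-\gamma^N)^{1/N}\big]^{k(M)}}{1 - (1-\gamma^N)^{1/N}} \tag{193}$$

From eqns. (188,193) we have for all $t$ and sufficiently large $M$

$$\mathbb{P}^{\gamma,P_0}(\|P_t\| > M) \le \frac{1}{1-\gamma^N} \cdot \frac{\big[(1-\gamma^N)^{1/N}\big]^{k(M)}}{1 - (1-\gamma^N)^{1/N}} \tag{194}$$

Since $\gamma > 0$ and $k(M) \to \infty$ as $M \to \infty$, it follows from the above that

$$\lim_{M \to \infty} \sup_{t \in \mathbb{T}_+} \mathbb{P}^{\gamma,P_0}(\|P_t\| > M) = 0 \tag{195}$$

Thus $\{P_t\}_{t \in \mathbb{T}_+}$ is s.b. (for all initial conditions $P_0$) for every $\gamma > 0$, and hence the Lemma follows.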

■ 
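To get a feel for the bound used in this proof, the Python sketch below evaluates it numerically. It assumes the threshold index is $k(M) = \min\{k : \kappa_1 \alpha^{2k} + \|Q\|(\alpha^{2k}-1)/(\alpha^2-1) \ge M\}$ and that the tail bound is $\frac{1}{1-\gamma^N} \cdot \frac{r^{k(M)}}{1-r}$ with $r = (1-\gamma^N)^{1/N}$, which is our reading of (183) and (194). The values $\kappa_1 = 3$ and $N = 1$ in the usage lines are illustrative assumptions for the scalar example of Section 9 (where $f_1(X) < 3$ for all $X \ge 0$), not constants given in the paper.

```python
from math import sqrt


def k_of_m(m, kappa1, alpha, q_norm, k_max=10_000):
    """Smallest k with kappa1*alpha^(2k) + ||Q||*(alpha^(2k)-1)/(alpha^2-1) >= m.
    Requires alpha > 1 (unstable A)."""
    a2 = alpha * alpha
    for k in range(k_max + 1):
        growth = kappa1 * a2 ** k + q_norm * (a2 ** k - 1.0) / (a2 - 1.0)
        if growth >= m:
            return k
    raise ValueError("k(M) not found within k_max")


def tail_bound(m, gamma, n, kappa1, alpha, q_norm):
    """Upper bound on P(||P_t|| > m), uniform in t, from the geometric series."""
    k = k_of_m(m, kappa1, alpha, q_norm)
    miss = 1.0 - gamma ** n      # probability that a block of n slots is not all arrivals
    r = miss ** (1.0 / n)
    return r ** k / (miss * (1.0 - r))


if __name__ == "__main__":
    # Assumed constants for the scalar example: alpha = sqrt(2), ||Q|| = 1, kappa1 = 3, N = 1.
    for m in (40.0, 1e3, 1e6):
        print(m, k_of_m(m, 3.0, sqrt(2.0), 1.0), tail_bound(m, 0.9, 1, 3.0, sqrt(2.0), 1.0))
```

The bound is deliberately crude (it can exceed 1 for small $M$), but it decays geometrically in $k(M)$, which is all that stochastic boundedness requires.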

Appendix C 
Proof of Theorem 5.2 

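Proof: For $X = \{X_1, \dots, X_n\} \in \bigotimes_{i=1}^{n} \mathbb{S}_+^N$, we have

$$\mathbb{P}^{\gamma,P_0}\big((P_{t_1}, \dots, P_{t_n}) = X\big) = \mathbb{P}^{\gamma,P_0}(P_{t_1} = X_1) \prod_{i=1}^{n-1} \mathbb{P}^{\gamma,P_0}\big(P_{t_{i+1}} = X_{i+1} \mid P_{t_i} = X_i\big) \tag{196}$$

which follows from the Markov property. Clearly, from the above we have

$$\mathbb{P}^{\gamma,P_0}\big((P_{t_1}, \dots, P_{t_n}) = X\big) = 0, \quad X \notin \mathcal{N}\big(\mathcal{S}^{P_0}_{t_1,\dots,t_n}\big) \tag{197}$$

Also, if $X \in \mathcal{N}\big(\mathcal{S}^{P_0}_{t_1,\dots,t_n}\big)$, it follows from (196) through simple manipulations and the independence of the packet dropouts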

P ? ' Po ((P tl ,--- ,P t „) y = x) = £ (198) 



^6<°,..,*„( x ) 



By manipulating each term on the R.H.S. above, we have 



^ mhy) ln ( (1 - = ^ hTaW ^ )ln(1 7) + <*» - lnT] 

= .W + ^-^lim^-^ 

= tt(7^) (199) 
It then follows from Proposition 3.10,eqns. (198,199) that 

lP 7 ' Po ({Pti , ■ • • , f = X) = _ min n{TZ) = <(X) (200) 

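(the case $X \notin \mathcal{N}\big(\mathcal{S}^{P_0}_{t_1,\dots,t_n}\big)$ is absorbed above by using the convention that the minimum of an empty set is $\infty$.)

Now consider $B \in \mathcal{B}\big(\bigotimes_{i=1}^{n} \mathbb{S}_+^N\big)$. If $B \cap \mathcal{N}\big(\mathcal{S}^{P_0}_{t_1,\dots,t_n}\big) = \emptyset$, then the claim is obvious. Hence, assume $B \cap \mathcal{N}\big(\mathcal{S}^{P_0}_{t_1,\dots,t_n}\big) \ne \emptyset$ (this intersection is necessarily finite) and note that

$$\mathbb{P}^{\gamma,P_0}\big((P_{t_1}, \dots, P_{t_n}) \in B\big) = \sum_{X \in B \cap \mathcal{N}(\mathcal{S}^{P_0}_{t_1,\dots,t_n})} \mathbb{P}^{\gamma,P_0}\big((P_{t_1}, \dots, P_{t_n}) = X\big) \tag{201}$$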
It then follows by Proposition 3.10



_*_ ln Uf,Po f {P P y €B \\ = min u 1 ln Uf,Po hp ,p tn y = y)) 

1-7) V V v n ' J J YcH o^^ p o \7tiln(l-7) V V v n ' )) 



7nin(l- 7 )"V V^' 1 ' ^ J) " xesn^"(< ...,t„) ^ M 1 ^)' 

min f(X) 

xesrw(s£°...,t n ) 
r*i.- .*n 



= inf 7 tl ''"' t "(X) (202) 
xes 



which establishes the Theorem. 
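The finite-sum structure used in this proof can be checked directly in the scalar example of Section 9. The Python sketch below enumerates all $2^t$ arrival patterns for a single time $t$ (the case $n = 1$), weights each pattern by $\gamma^{\#\text{arrivals}}(1-\gamma)^{\#\text{drops}}$ (a consequence of the independent Bernoulli arrivals), and sums the weights of the patterns whose end point lies in the event. The event $\{X \ge 40\}$, the horizon $t = 8$, the starting point $P_0 = P^*$, and the function name are illustrative assumptions.

```python
from itertools import product
from math import log, sqrt

P_STAR = 1.0 + sqrt(2.0)


def f0(x):
    """Riccati step when the packet is dropped."""
    return 2.0 * x + 1.0


def f1(x):
    """Riccati step when the packet arrives (scalar case, C = R = 1)."""
    return 2.0 * x + 1.0 - 2.0 * x * x / (x + 1.0)


def exact_probability(gamma, t, threshold=40.0, p0=P_STAR):
    """P(P_t >= threshold), computed as a finite sum over arrival patterns."""
    total = 0.0
    for arrivals in product((0, 1), repeat=t):  # arrivals[s] is the arrival indicator at time s
        x = p0
        for g in arrivals:
            x = f1(x) if g else f0(x)
        if x >= threshold:
            drops = arrivals.count(0)
            total += gamma ** (t - drops) * (1.0 - gamma) ** drops
    return total


if __name__ == "__main__":
    for gamma in (0.7, 0.9, 0.99):
        p = exact_probability(gamma, t=8)
        print(gamma, p, log(p) / log(1.0 - gamma))
```

In this sketch the printed ratio $\ln p / \ln(1-\gamma)$ stays close to 4, in line with (167); the exponent is determined by the patterns with the fewest drops that still reach the event, which is the mechanism behind the minimum over strings in the proof.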



Appendix D 
Proofs of Results in Section 6 

Proof of Proposition 6.1 

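Proof: That $I_L(\cdot)$ is lower semicontinuous follows from its definition (see, for example, [24]). Now for $a \in \mathbb{R}_+$ consider the level set $K_a = \{X \in \mathbb{S}_+^N \mid I_L(X) \le a\}$. By lower semicontinuity we know that $K_a$ is closed. We now show that $K_a$ is bounded and hence compact. To this end, we note that for all $b \in \mathbb{R}_+$

$$\{Y \in \mathbb{S}_+^N \mid I(Y) \le b\} \subset \big\{Y \in \mathbb{S}_+^N \mid Y \preceq f_0^{\lceil b \rceil}(\alpha_{P^*} I)\big\} \tag{203}$$

for some constant $\alpha_{P^*} \in \mathbb{R}_+$, which can be chosen independently of $b$. Indeed, $I(Y) \le b$ implies that $\mathcal{S}^{P^*}(Y)$ is non-empty and

$$I(Y) = \inf_{\mathcal{R} \in \mathcal{S}^{P^*}(Y)} \pi(\mathcal{R}) \le b \tag{204}$$

Since $\pi(\cdot)$ takes on integral values only, the infimum above is attained and there exists $\mathcal{R} \in \mathcal{S}^{P^*}(Y)$ with $\pi(\mathcal{R}) \le b$. Then, from Proposition 3.6, there exists $\alpha_{P^*} \in \mathbb{R}_+$ (depending on $P^*$ only) such that

$$Y = \mathcal{N}(\mathcal{R}) \preceq f_0^{\pi(\mathcal{R})}(\alpha_{P^*} I) \preceq f_0^{\lceil b \rceil}(\alpha_{P^*} I) \tag{205}$$

This verifies the claim in (203).

Now consider $\varepsilon_1 > 0$. For $X \in \mathbb{S}_+^N$, the quantity $\inf_{Y \in B_\varepsilon(X)} I(Y)$ is non-decreasing as $\varepsilon \downarrow 0$, and hence $X \in K_a$ implies

$$\inf_{Y \in B_{\varepsilon_1}(X)} I(Y) \le a \tag{206}$$

Since $I(\cdot)$ takes on integral values, the infimum is attained and there exists $Y(X) \in B_{\varepsilon_1}(X)$ such that $I(Y(X)) \le a$. From (203) we then have

$$Y(X) \preceq f_0^{\lceil a \rceil}(\alpha_{P^*} I) \implies \|Y(X)\| \le \big\|f_0^{\lceil a \rceil}(\alpha_{P^*} I)\big\| \tag{207}$$

Since $Y(X) \in B_{\varepsilon_1}(X)$ we have

$$\|X\| \le \|Y(X)\| + \varepsilon_1 \le \big\|f_0^{\lceil a \rceil}(\alpha_{P^*} I)\big\| + \varepsilon_1 \tag{208}$$

We thus note that

$$K_a \subset \big\{Z \in \mathbb{S}_+^N \mid \|Z\| \le \big\|f_0^{\lceil a \rceil}(\alpha_{P^*} I)\big\| + \varepsilon_1\big\} \tag{209}$$

which verifies the boundedness of $K_a$. Hence the level sets $K_a$ are closed and bounded for all $a \in \mathbb{R}_+$, establishing the goodness of $I_L(\cdot)$.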

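For Assertion (ii), we note that for arbitrary $\varepsilon > 0$,

$$\inf_{Y \in \overline{B}_\varepsilon(X)} I(Y) \le \inf_{Y \in B_\varepsilon(X)} I(Y) \le \inf_{Y \in \overline{B}_{\varepsilon/2}(X)} I(Y) \tag{210}$$

The assertion then follows by passing to the limit as $\varepsilon \to 0$ on each side.

For Assertion (iii), note that, in general, we have for arbitrary $\varepsilon > 0$, $I(X) \ge \inf_{Y \in B_\varepsilon(X)} I(Y)$, and by passing to the limit it follows that

$$I(X) \ge \lim_{\varepsilon \to 0} \inf_{Y \in B_\varepsilon(X)} I(Y) = I_L(X) \tag{211}$$

This immediately gives for any set $\Gamma \in \mathcal{B}(\mathbb{S}_+^N)$

$$\inf_{X \in \Gamma} I(X) \ge \inf_{X \in \Gamma} I_L(X) \tag{212}$$

For the reverse inequality when $\Gamma$ is open, consider $X \in \Gamma$. Then there exists $\varepsilon_1 > 0$ (depending on $X$) such that, for every $0 < \varepsilon < \varepsilon_1$, the open ball $B_\varepsilon(X) \subset \Gamma$. It then follows that

$$\inf_{Y \in B_\varepsilon(X)} I(Y) \ge \inf_{Y \in \Gamma} I(Y), \quad 0 < \varepsilon < \varepsilon_1 \tag{213}$$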
V£fl t (x) ~ i-er 
Taking the limit on both sides we have

$$\inf_{Y \in \Gamma} I(Y) \le \lim_{\varepsilon \to 0} \inf_{Y \in B_\varepsilon(X)} I(Y) = I_L(X) \tag{214}$$

Thus for every $X \in \Gamma$, we have

$$I_L(X) \ge \inf_{Y \in \Gamma} I(Y) \tag{215}$$

Taking the infimum over $X \in \Gamma$ on the L.H.S. gives the required inequality

$$\inf_{X \in \Gamma} I_L(X) \ge \inf_{X \in \Gamma} I(X) \tag{216}$$

and the result follows.

We now prove Assertion (iv). By the notion of limits involving continuous arguments, it suffices to show that for every sequence $\{\varepsilon_n\}_{n \in \mathbb{N}}$ with

$$\lim_{n \to \infty} \varepsilon_n = 0, \quad 0 < \varepsilon_n < 1 \ \ \forall n \tag{217}$$

we have

$$\lim_{n \to \infty} \inf_{Y \in \overline{K_{\varepsilon_n}}} I_L(Y) = \inf_{Y \in K} I_L(Y) \tag{218}$$

To this end consider such a sequence $\{\varepsilon_n\}$ and assume on the contrary that (218) is not satisfied. Since $K \subset \overline{K_{\varepsilon_n}}$, we clearly have

$$\lim_{n \to \infty} \inf_{Y \in \overline{K_{\varepsilon_n}}} I_L(Y) \le \inf_{Y \in K} I_L(Y) \tag{219}$$

Thus the hypothesis that (218) is not satisfied implies

$$\lim_{n \to \infty} \inf_{Y \in \overline{K_{\varepsilon_n}}} I_L(Y) < \inf_{Y \in K} I_L(Y) \tag{220}$$

We note that the sets $\overline{K_{\varepsilon_n}}$ are compact for all $n$, and since a lower semicontinuous function attains its minimum over a compact set, for every $n \in \mathbb{N}$ there exists $X_n \in \overline{K_{\varepsilon_n}}$ such that

$$\inf_{Y \in \overline{K_{\varepsilon_n}}} I_L(Y) = I_L(X_n) \tag{221}$$

Similarly, there exists $Y^* \in K$ such that

$$\inf_{Y \in K} I_L(Y) = I_L(Y^*) \tag{222}$$

We note that $\{I_L(X_n)\}$ is a non-decreasing sequence (hence the limit exists) and

$$\lim_{n \to \infty} I_L(X_n) < I_L(Y^*) \tag{223}$$

Also, given $Z \in \mathbb{S}_+^N$, it follows from the continuity of the metric that the function $d_Z : \mathbb{S}_+^N \to \mathbb{R}_+$ given by

$$d_Z(Y) = \|Y - Z\|, \quad \forall Y \in \mathbb{S}_+^N \tag{224}$$

is continuous and hence attains its minimum over a compact set. Thus for every $n \in \mathbb{N}$, there exists $Y_n \in K$ such that

$$\inf_{Y \in K} \|X_n - Y\| = \|X_n - Y_n\| \le \varepsilon_n \tag{225}$$

where the last inequality follows from the fact that $X_n \in \overline{K_{\varepsilon_n}}$.

We note that the sequence $\{Y_n\}$ belongs to the compact set $K$ and hence there exists a subsequence $\{Y_{n_k}\}_{k \in \mathbb{N}}$ in $K$ which converges to some $\widetilde{Y} \in K$, i.e.,

$$\lim_{k \to \infty} Y_{n_k} = \widetilde{Y} \tag{226}$$

Now consider the sequence $\{X_{n_k}\}_{k \in \mathbb{N}}$. We then have

$$\|X_{n_k} - \widetilde{Y}\| \le \|X_{n_k} - Y_{n_k}\| + \|Y_{n_k} - \widetilde{Y}\| \le \varepsilon_{n_k} + \|Y_{n_k} - \widetilde{Y}\| \tag{227}$$

Taking the limit as $k \to \infty$ we obtain

$$\lim_{k \to \infty} X_{n_k} = \widetilde{Y} \tag{228}$$

We then have from the lower semicontinuity of $I_L(\cdot)$ and (223)

$$I_L(\widetilde{Y}) \le \liminf_{k \to \infty} I_L(X_{n_k}) < I_L(Y^*) \tag{229}$$

This contradicts the fact that $Y^*$ is the minimizer of $I_L(\cdot)$ over $K$, and we conclude that (218) holds. Hence the result follows.
Proof of Lemma 6.5

Proof: The case $\ell(F) = 0$ is trivial, as the assertion follows by choosing an arbitrary positive $t_F$.
We consider the case $\ell(F) \ge 1$. We use an inductive argument, and it suffices to show that for every $1 \le i \le \ell(F)$ there exists a positive $t_F^i \in \mathbb{T}_+$ such that, for $\mathcal{R} \in \mathcal{U}(F)$ with $\operatorname{len}(\mathcal{R}) \ge t_F^i$, we have

$$\pi\big(\mathcal{R}^{t_F^i}\big) \ge i \tag{230}$$

We start with $i = 1$. Assume on the contrary that there is no such $t_F^1 \in \mathbb{T}_+$ with the above postulated properties. Since $\mathcal{U}(F)$ is non-empty, by Proposition 3.6 Assertion (i), there exists $t_0 \in \mathbb{T}_+$ such that

$$\mathcal{S}_t^{P^*} \cap \mathcal{U}(F) \ne \emptyset, \quad t \ge t_0 \tag{231}$$

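Thus the non-existence of $t_F^1$ implies that for every $t \ge t_0$ there exists a string $\mathcal{R}_t \in \mathcal{U}(F)$, with $\operatorname{len}(\mathcal{R}_t) \ge t$, such that $\pi\big(\mathcal{R}_t^t\big) = 0$. Such a string $\mathcal{R}_t$ is then necessarily of the form:

$$\mathcal{R}_t = \big(f_1^t, f_{i_1}, \dots, f_{i_{\operatorname{len}(\mathcal{R}_t) - t}}, P^*\big) \tag{232}$$

where $i_1, \dots, i_{\operatorname{len}(\mathcal{R}_t) - t} \in \{0, 1\}$. Thus, denoting

$$X_t = f_{i_1} \circ f_{i_2} \circ \cdots \circ f_{i_{\operatorname{len}(\mathcal{R}_t) - t}}(P^*) \tag{233}$$

we note that

$$\mathcal{N}(\mathcal{R}_t) = f_1^t(X_t) \tag{234}$$

Now consider the sequence $\{\mathcal{R}_t\}_{t \ge t_0}$ of such strings as $t \to \infty$. Let $\varepsilon > 0$ be an arbitrary positive number. By Lemma 3.7, the uniform convergence of Riccati iterates implies that there exists $t_\varepsilon \ge N$ such that, for every $X \in \mathbb{S}_+^N$,

$$\|f_1^t(X) - P^*\| \le \varepsilon, \quad t \ge t_\varepsilon \tag{235}$$

(we emphasize that the constant $t_\varepsilon$ can be chosen independently of $X$). Then, defining $t'_\varepsilon = \max(t_0, t_\varepsilon)$, it follows from (234) that for $t \ge t'_\varepsilon$

$$\|\mathcal{N}(\mathcal{R}_t) - P^*\| = \|f_1^t(X_t) - P^*\| \le \varepsilon \tag{236}$$

Since $\varepsilon > 0$ above was arbitrary, it follows that the sequence $\{\mathcal{N}(\mathcal{R}_t)\}_{t \ge t_0}$ of numerical values converges to $P^*$ as $t \to \infty$. However, by construction, the sequence $\{\mathcal{N}(\mathcal{R}_t)\}_{t \ge t_0}$ belongs to the set $F$, and we conclude that $P^*$ is a limit point of the set $F$. Since $F$ is closed, we have $P^* \in F$. It then follows that

$$\{\mathcal{R} \in \mathcal{S}^{P^*} \mid \mathcal{N}(\mathcal{R}) = P^*\} \subset \mathcal{U}(F) \tag{237}$$

Hence, in particular, the string $(f_1, P^*) \in \mathcal{U}(F)$. The fact that $\pi\big((f_1, P^*)\big) = 0$ then contradicts the hypothesis $\ell(F) \ge 1$.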

Thus by contradiction we establish that if $\ell(F) \ge 1$, there exists $t_F^1$ satisfying the properties in (230) for $i = 1$. Note that, if $\ell(F) = 1$, this step completes the proof of the Lemma. In the general case, i.e., to establish (230) for all $1 \le i \le \ell(F)$, we need the following additional steps.

We can now assume $\ell(F) \ge 2$. We assume on the contrary that the claim in (230) does not hold for all $1 \le i \le \ell(F)$. By the previous arguments, the claim clearly holds for $i = 1$. Then, let $1 \le k < \ell(F)$ be the largest integer such that the claim in (230) holds for all $1 \le i \le k$. The hypothesis $k < \ell(F)$ implies that there exists no $t_F^{k+1} \in \mathbb{T}_+$ satisfying the claim in (230) for $i = k + 1$. Since the claim holds for $i = k$, there exists $t_F^k \in \mathbb{T}_+$ such that for all $\mathcal{R} \in \mathcal{U}(F)$ with $\operatorname{len}(\mathcal{R}) \ge t_F^k$, we have $\pi\big(\mathcal{R}^{t_F^k}\big) \ge k$. The non-existence of $t_F^{k+1}$ and (231) imply that for every $t \ge t_0$ there exists a string $\mathcal{R}_t \in \mathcal{U}(F)$, with $\operatorname{len}(\mathcal{R}_t) \ge t$, such that

$$\pi\big(\mathcal{R}_t^t\big) < k + 1 \tag{238}$$

We now study the structure of the strings $\mathcal{R}_t$ for sufficiently large $t$. To this end, define $t'_0 = \max(t_0, t_F^k)$. Then by the existence of $t_F^k$ and (238) it follows that $\pi\big(\mathcal{R}_t^t\big) = k$ for $t \ge t'_0$. Hence for $t \ge t'_0$, $\mathcal{R}_t$ is necessarily of the form:

$$\mathcal{R}_t = \big(f_{i_1}, \dots, f_{i_{t_F^k}}, f_1^{t - t_F^k}, f_{j_1}, \dots, f_{j_{\operatorname{len}(\mathcal{R}_t) - t}}, P^*\big) \tag{239}$$

where $i_1, \dots, i_{t_F^k} \in \{0, 1\}$ are such that $\pi\big(\mathcal{R}_t^{t_F^k}\big) = k$ and $j_1, \dots, j_{\operatorname{len}(\mathcal{R}_t) - t} \in \{0, 1\}$.

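Now consider the sequence $\{\mathcal{R}_t\}_{t \ge t'_0}$ and define the set $\mathcal{J}$ by $\mathcal{J} = \{\mathcal{R}_t,\ t \ge t'_0\}$. Also, define $\mathcal{J}_1 = \{\mathcal{R} \in \mathcal{S}^{P^*}_{t_F^k} \mid \pi(\mathcal{R}) = k\}$. Consider the function $\Lambda^{t_F^k} : \mathcal{J} \to \mathcal{J}_1$ given by

$$\Lambda^{t_F^k}(\mathcal{R}) = \mathcal{R}^{t_F^k}, \quad \forall \mathcal{R} \in \mathcal{J} \tag{240}$$

Since the cardinality of $\mathcal{J}_1$ is finite and $\mathcal{J}$ is countably infinite, there exists $\mathcal{R}^* \in \mathcal{J}_1$ such that the set $\big(\Lambda^{t_F^k}\big)^{-1}(\{\mathcal{R}^*\})$ is countably infinite. This in turn implies that we can extract a subsequence $\{\mathcal{R}_{t_m}\}_{m \ge 0}$ from the sequence $\{\mathcal{R}_t\}_{t \ge t'_0}$ such that

$$\mathcal{R}_{t_m}^{t_F^k} = \mathcal{R}^*, \quad \forall m \ge 0 \tag{241}$$

In other words, if $\mathcal{R}^*$ is represented by $\mathcal{R}^* = \big(f_{i_1^*}, \dots, f_{i^*_{t_F^k}}, P^*\big)$ for some fixed $i_1^*, \dots, i^*_{t_F^k} \in \{0, 1\}$, then for every $m$ the string $\mathcal{R}_{t_m}$ is of the form:

$$\mathcal{R}_{t_m} = \big(f_{i_1^*}, \dots, f_{i^*_{t_F^k}}, f_1^{t_m - t_F^k}, f_{j_1}, \dots, f_{j_{\operatorname{len}(\mathcal{R}_{t_m}) - t_m}}, P^*\big) \tag{242}$$

where $j_1, \dots, j_{\operatorname{len}(\mathcal{R}_{t_m}) - t_m} \in \{0, 1\}$ are arbitrary. Denoting

$$X_m = f_{j_1} \circ f_{j_2} \circ \cdots \circ f_{j_{\operatorname{len}(\mathcal{R}_{t_m}) - t_m}}(P^*), \quad \forall m \tag{243}$$

we have

$$\mathcal{N}(\mathcal{R}_{t_m}) = f_{i_1^*} \circ \cdots \circ f_{i^*_{t_F^k}}\big(f_1^{t_m - t_F^k}(X_m)\big) \tag{244}$$

Since $t_m \to \infty$ as $m \to \infty$, using Lemma 3.7 in a similar way, we have

$$\lim_{m \to \infty} f_1^{t_m - t_F^k}(X_m) = P^* \tag{245}$$

Noting that the function $f_{i_1^*} \circ \cdots \circ f_{i^*_{t_F^k}} : \mathbb{S}_+^N \to \mathbb{S}_+^N$ is continuous (being the composition of continuous functions), we then have

$$\lim_{m \to \infty} \mathcal{N}(\mathcal{R}_{t_m}) = \lim_{m \to \infty} f_{i_1^*} \circ \cdots \circ f_{i^*_{t_F^k}}\big(f_1^{t_m - t_F^k}(X_m)\big) = f_{i_1^*} \circ \cdots \circ f_{i^*_{t_F^k}}(P^*) = \mathcal{N}(\mathcal{R}^*) \tag{246}$$

Thus the sequence $\{\mathcal{N}(\mathcal{R}_{t_m})\}_{m \ge 0}$ in $F$ converges to $\mathcal{N}(\mathcal{R}^*)$ as $m \to \infty$. Hence, $\mathcal{N}(\mathcal{R}^*) \in F$, as $F$ is closed and $\mathcal{N}(\mathcal{R}^*)$ is a limit point of $F$. This in turn implies $\mathcal{R}^* \in \mathcal{U}(F)$. Since $\pi(\mathcal{R}^*) = k$, this contradicts the hypothesis $k < \ell(F)$, and the claim in (230) holds for all $1 \le i \le \ell(F)$. This establishes the Lemma.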

■ 

Proof of Lemma 6.8 

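Proof: Let $a > 0$ be arbitrary and let $z \in \mathbb{N}$ be such that $z \ge a$. From Proposition 3.6 Assertion (iii), there exists $\alpha_{P^*} \in \mathbb{R}_+$ such that

$$\mathcal{N}(\mathcal{R}) \preceq f_0^{\pi(\mathcal{R})}(\alpha_{P^*} I), \quad \forall \mathcal{R} \in \mathcal{S}^{P^*} \tag{247}$$

Define $b \in \mathbb{R}_+$ such that $\|f_0^z(\alpha_{P^*} I)\| < b$ and consider the compact set $K_a = \{X \in \mathbb{S}_+^N \mid \|X\| \le b\}$. Also define the closed set $F_b$ by $F_b = \{X \in \mathbb{S}_+^N \mid \|X\| \ge b\}$. As per Lemma 6.5, define the set $\mathcal{U}(F_b)$ as

$$\mathcal{U}(F_b) = \{\mathcal{R} \in \mathcal{S}^{P^*} \mid \mathcal{N}(\mathcal{R}) \in F_b\} \tag{248}$$

We then have the following inclusion:

$$\mathcal{U}(F_b) \subset \{\mathcal{R} \in \mathcal{S}^{P^*} \mid \pi(\mathcal{R}) \ge z\} \tag{249}$$

and hence

$$\ell(F_b) = \inf_{\mathcal{R} \in \mathcal{U}(F_b)} \pi(\mathcal{R}) \ge z \tag{250}$$

Since $F_b$ is closed, by Lemma 6.5 there exists $t_{F_b} \in \mathbb{T}_+$ such that

$$\pi\big(\mathcal{R}^{t_{F_b}}\big) \ge z, \quad \forall \mathcal{R} \in \mathcal{U}(F_b) \tag{251}$$

To estimate the probability $\mu^\gamma(K_a^C)$, we now follow a similar set of arguments as used in Lemma 6.7. First note that we have by weak convergence: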
$$\mu^\gamma(K_a^C) \le \liminf_{t \to \infty} \mathbb{P}^{\gamma,P^*}\big(P_t \in K_a^C\big) \le \liminf_{t \to \infty} \mathbb{P}^{\gamma,P^*}\big(P_t \in F_b\big) \tag{252}$$

For $t \in \mathbb{T}_+$ define the sets $\mathcal{J}_t^{P^*} = \mathcal{S}_t^{P^*} \cap \mathcal{U}(F_b)$. For $t \ge t_{F_b}$ we have (see also (105))



P 7,P« ( P 7,P* e ^ = £ (1 _ < h*A (1 _ < J ( i _ ^ (2 53) 

A familiar set of arguments as in Lemma 6.7 yields the following from eqns. (252,253): 

from which we obtain 

$$\limsup_{\gamma \uparrow 1} -\frac{1}{\ln(1-\gamma)} \ln \mu^\gamma(K_a^C) \le -z \le -a \tag{255}$$

Thus for every $a > 0$ there exists a compact set $K_a$ such that (111) is satisfied, and the Lemma follows. ■

Appendix E 
Proof of Lemma 8.2 

Proof: By Theorem 4.2 it suffices to show that for all $M > 0$,

$$\inf_{X \in B^C_M(P^*)} I(X) = \ell(M) \tag{256}$$

$$\inf_{X \in (B^C_M(P^*))^\circ} I(X) = \ell^+(M) \tag{257}$$

We prove (256) only, the proof of (257) being similar.

The class A assumption implies (through the same line of arguments as in Proposition 3.6 Assertion (iii)) that

$$\mathcal{N}(\mathcal{R}) \preceq f_0^{\pi(\mathcal{R})}(P^*), \quad \forall \mathcal{R} \in \mathcal{S}^{P^*} \tag{258}$$

By definition it follows that $f_0^{\ell(M)}(P^*) \in B^C_M(P^*)$ and hence

$$\inf_{X \in B^C_M(P^*)} I(X) \le \ell(M) \tag{259}$$

Now assume on the contrary that (256) does not hold. Then by (259) we have

$$\inf_{X \in B^C_M(P^*)} I(X) < \ell(M) \tag{260}$$

Thus there exists $\mathcal{R} \in \mathcal{S}^{P^*}$ such that

$$\mathcal{N}(\mathcal{R}) \in B^C_M(P^*), \quad \pi(\mathcal{R}) \le \ell(M) - 1 \tag{261}$$

We then have from (258)

$$\mathcal{N}(\mathcal{R}) \preceq f_0^{\pi(\mathcal{R})}(P^*) \preceq f_0^{\ell(M)-1}(P^*) \tag{262}$$

which implies

$$\big\|f_0^{\ell(M)-1}(P^*) - P^*\big\| \ge \|\mathcal{N}(\mathcal{R}) - P^*\| \ge M \tag{263}$$

This contradicts the definition of $\ell(M)$, which is the smallest non-negative integer $k$ such that $\|f_0^k(P^*) - P^*\| \ge M$. We thus conclude that the claim in (256) holds. ■